From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: igt-dev@lists.freedesktop.org
Subject: Re: [PATCH i-g-t 08/37] lib/rendercopy: Fix fastclear scaling
Date: Thu, 29 Aug 2024 18:11:13 +0300
Message-ID: <ZtCPkVu0wx9nX2Sx@intel.com>
In-Reply-To: <548a76d8-dd30-4a3b-a37a-47daec6b4475@gmail.com>
On Tue, Aug 27, 2024 at 06:17:33PM +0300, Juha-Pekka Heikkila wrote:
> On 3.7.2024 2.27, Ville Syrjala wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > The hardcoded 64x16 fastclear coordinate scaling
> > factors assume 32bpp+Y-tile. Determine the correct
> > scaling factors for other tilings and bpps.
> >
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> > lib/rendercopy_gen9.c | 105 +++++++++++++++++++++++++++++++++++++++---
> > 1 file changed, 99 insertions(+), 6 deletions(-)
> >
> > diff --git a/lib/rendercopy_gen9.c b/lib/rendercopy_gen9.c
> > index 57b64dad1b1d..42a227916f15 100644
> > --- a/lib/rendercopy_gen9.c
> > +++ b/lib/rendercopy_gen9.c
> > @@ -346,6 +346,95 @@ gen8_fill_ps(struct intel_bb *ibb,
> >  	return intel_bb_copy_data(ibb, kernel, size, 64);
> >  }
> >
> > +static void fast_clear_scale(const struct intel_buf *buf,
> > +			     int *x_scale, int *y_scale)
> > +{
> > +	switch (buf->tiling) {
> > +	case I915_TILING_4:
> > +		*x_scale = 1024 * 8 / buf->bpp;
>
> I was trying to figure out where 1024 is coming from but fell short; maybe
> a comment could be added for this magic number. Otherwise the patch looks ok.

The 1024 is just the required alignment for 8bpp. For tile4 and tile-y
we can simply divide that down to get the alignment for higher
bpps. The magic numbers are listed in bspec:47709.

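As a minimal sketch of that arithmetic (not code from the patch; the
sketch_* names are made up for illustration, and the 1024/256 base
values are simply the ones used in the quoted switch):

```c
#include <assert.h>

/* Hypothetical tiling tags for this sketch only; the patch itself
 * switches on I915_TILING_4 and I915_TILING_Y. */
enum sketch_tiling { SKETCH_TILE4, SKETCH_TILEY };

/* Horizontal fast clear alignment at 8bpp, as used in the patch:
 * 1024 for tile4, 256 for tile-y. */
static int sketch_x_align_8bpp(enum sketch_tiling tiling)
{
	return tiling == SKETCH_TILE4 ? 1024 : 256;
}

/* Divide the 8bpp alignment down for wider pixels. */
static int sketch_x_scale(enum sketch_tiling tiling, int bpp)
{
	assert(bpp == 8 || bpp == 16 || bpp == 32 ||
	       bpp == 64 || bpp == 128);

	return sketch_x_align_8bpp(tiling) * 8 / bpp;
}
```

For 32bpp tile-y this gives 256 * 8 / 32 = 64, i.e. the old hardcoded
x factor. In the patch the y factor stays at 16 for both of these
tilings.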
>
> /Juha-Pekka
>
> > +		*y_scale = 16;
> > +		break;
> > +	case I915_TILING_64:
> > +		switch (buf->bpp) {
> > +		case 8:
> > +			*x_scale = 128;
> > +			*y_scale = 128;
> > +			break;
> > +		case 16:
> > +			*x_scale = 128;
> > +			*y_scale = 64;
> > +			break;
> > +		case 32:
> > +			*x_scale = 64;
> > +			*y_scale = 64;
> > +			break;
> > +		case 64:
> > +			*x_scale = 64;
> > +			*y_scale = 32;
> > +			break;
> > +		case 128:
> > +			*x_scale = 32;
> > +			*y_scale = 32;
> > +			break;
> > +		}
> > +		break;
> > +	case I915_TILING_Y:
> > +		*x_scale = 256 * 8 / buf->bpp;
> > +		*y_scale = 16;
> > +		break;
> > +	case I915_TILING_Yf:
> > +		switch (buf->bpp) {
> > +		case 8:
> > +			*x_scale = 128;
> > +			*y_scale = 32;
> > +			break;
> > +		case 16:
> > +			*x_scale = 128;
> > +			*y_scale = 16;
> > +			break;
> > +		case 32:
> > +			*x_scale = 64;
> > +			*y_scale = 16;
> > +			break;
> > +		case 64:
> > +			*x_scale = 64;
> > +			*y_scale = 8;
> > +			break;
> > +		case 128:
> > +			*x_scale = 32;
> > +			*y_scale = 8;
> > +			break;
> > +		}
> > +		break;
> > +	case I915_TILING_Ys:
> > +		switch (buf->bpp) {
> > +		case 8:
> > +			*x_scale = 64;
> > +			*y_scale = 64;
> > +			break;
> > +		case 16:
> > +			*x_scale = 64;
> > +			*y_scale = 32;
> > +			break;
> > +		case 32:
> > +			*x_scale = 32;
> > +			*y_scale = 32;
> > +			break;
> > +		case 64:
> > +			*x_scale = 32;
> > +			*y_scale = 16;
> > +			break;
> > +		case 128:
> > +			*x_scale = 16;
> > +			*y_scale = 16;
> > +			break;
> > +		}
> > +		break;
> > +	default:
> > +		igt_assert(0);
> > +	}
> > +}
> > +
> >  /*
> >   * gen7_fill_vertex_buffer_data populate vertex buffer with data.
> >   *
> > @@ -360,6 +449,7 @@ static uint32_t
> >  gen7_fill_vertex_buffer_data(struct intel_bb *ibb,
> >  			     const struct intel_buf *src,
> >  			     uint32_t src_x, uint32_t src_y,
> > +			     const struct intel_buf *dst,
> >  			     uint32_t dst_x, uint32_t dst_y,
> >  			     uint32_t width, uint32_t height)
> >  {
> > @@ -384,17 +474,21 @@ gen7_fill_vertex_buffer_data(struct intel_bb *ibb,
> >  		emit_vertex_normalized(ibb, src_x, intel_buf_width(src));
> >  		emit_vertex_normalized(ibb, src_y, intel_buf_height(src));
> >  	} else {
> > -		emit_vertex_2s(ibb, DIV_ROUND_UP(dst_x + width, 64), DIV_ROUND_UP(dst_y + height, 16));
> > +		int x_scale, y_scale;
> > +
> > +		fast_clear_scale(dst, &x_scale, &y_scale);
> > +
> > +		emit_vertex_2s(ibb, DIV_ROUND_UP(dst_x + width, x_scale), DIV_ROUND_UP(dst_y + height, y_scale));
> >
> >  		emit_vertex_normalized(ibb, 0, 0);
> >  		emit_vertex_normalized(ibb, 0, 0);
> >
> > -		emit_vertex_2s(ibb, dst_x/64, DIV_ROUND_UP(dst_y + height, 16));
> > +		emit_vertex_2s(ibb, dst_x/x_scale, DIV_ROUND_UP(dst_y + height, y_scale));
> >
> >  		emit_vertex_normalized(ibb, 0, 0);
> >  		emit_vertex_normalized(ibb, 0, 0);
> >
> > -		emit_vertex_2s(ibb, dst_x/64, dst_y/16);
> > +		emit_vertex_2s(ibb, dst_x/x_scale, dst_y/y_scale);
> >
> >  		emit_vertex_normalized(ibb, 0, 0);
> >  		emit_vertex_normalized(ibb, 0, 0);
> > @@ -1108,9 +1202,8 @@ void _gen9_render_op(struct intel_bb *ibb,
> >  	ps_binding_table = gen8_bind_surfaces(ibb, src, dst);
> >  	ps_sampler_state = gen8_create_sampler(ibb);
> >  	ps_kernel_off = gen8_fill_ps(ibb, ps_kernel, ps_kernel_size);
> > -	vertex_buffer = gen7_fill_vertex_buffer_data(ibb, src,
> > -						     src_x, src_y,
> > -						     dst_x, dst_y,
> > +	vertex_buffer = gen7_fill_vertex_buffer_data(ibb, src, src_x, src_y,
> > +						     dst, dst_x, dst_y,
> >  						     width, height);
> >  	cc.cc_state = gen6_create_cc_state(ibb);
> >  	cc.blend_state = gen8_create_blend_state(ibb);
--
Ville Syrjälä
Intel