On Wed, Dec 06, 2023 at 10:01:20AM +0530, Himal Prasad Ghimiray wrote:- The XY_CTRL_SURF_COPY_BLT instruction operationg on ccs data expects size in pages of main memory for which CCS data should be copied. - The bitfield representing copy size in XY_CTRL_SURF_COPY_BLT has shifted one bit higher in the instruction. v2: - Fix the num_pages for ccs size calculation. - Address nits (Thomas) Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> --- drivers/gpu/drm/xe/regs/xe_gpu_commands.h | 1 + drivers/gpu/drm/xe/xe_migrate.c | 21 +++++++++++++++------ 2 files changed, 16 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/xe/regs/xe_gpu_commands.h b/drivers/gpu/drm/xe/regs/xe_gpu_commands.h index 1f9c32e694c6..6ff6c7aaa648 100644 --- a/drivers/gpu/drm/xe/regs/xe_gpu_commands.h +++ b/drivers/gpu/drm/xe/regs/xe_gpu_commands.h @@ -13,6 +13,7 @@ #define DST_ACCESS_TYPE_SHIFT 20 #define CCS_SIZE_MASK 0x3FF #define CCS_SIZE_SHIFT 8 +#define XE2_CCS_SIZE_SHIFT 9It would be preferred if we just removed the shifts and used FIELD_PREP() to put values into proper field masks instead.#define XY_CTRL_SURF_MOCS_MASK GENMASK(31, 26) #define XE2_XY_CTRL_SURF_MOCS_INDEX_MASK GENMASK(31, 28) #define NUM_CCS_BYTES_PER_BLOCK 256 diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c index e8b567708ac0..7ef068451b59 100644 --- a/drivers/gpu/drm/xe/xe_migrate.c +++ b/drivers/gpu/drm/xe/xe_migrate.c @@ -526,21 +526,30 @@ static void emit_copy_ccs(struct xe_gt *gt, struct xe_bb *bb, struct xe_device *xe = gt_to_xe(gt); u32 *cs = bb->cs + bb->len; u32 num_ccs_blks; + u32 num_pages; + u32 ccs_copy_size; u32 mocs; - num_ccs_blks = DIV_ROUND_UP(xe_device_ccs_bytes(gt_to_xe(gt), size), - NUM_CCS_BYTES_PER_BLOCK); - xe_gt_assert(gt, num_ccs_blks <= NUM_CCS_BLKS_PER_XFER); + if (GRAPHICS_VERx100(xe) >= 2000) { + num_pages = DIV_ROUND_UP(size, XE_PAGE_SIZE); + xe_gt_assert(gt, num_pages <= 1024);Rather than using a magic number here, FIELD_FIT() might be better. We can do the same for the other branch as well and get rid of NUM_CCS_BLKS_PER_XFER too.
Sure. Using FIELD_FIT is much better. Will update in next version.
BR
Himal
Matt- if (GRAPHICS_VERx100(xe) >= 2000) + ccs_copy_size = ((num_pages - 1) & CCS_SIZE_MASK) << XE2_CCS_SIZE_SHIFT; mocs = FIELD_PREP(XE2_XY_CTRL_SURF_MOCS_INDEX_MASK, gt->mocs.uc_index); - else + + } else { + num_ccs_blks = DIV_ROUND_UP(xe_device_ccs_bytes(gt_to_xe(gt), size), + NUM_CCS_BYTES_PER_BLOCK); + xe_gt_assert(gt, num_ccs_blks <= NUM_CCS_BLKS_PER_XFER); + + ccs_copy_size = ((num_ccs_blks - 1) & CCS_SIZE_MASK) << CCS_SIZE_SHIFT; mocs = FIELD_PREP(XY_CTRL_SURF_MOCS_MASK, gt->mocs.uc_index); + } *cs++ = XY_CTRL_SURF_COPY_BLT | (src_is_indirect ? 0x0 : 0x1) << SRC_ACCESS_TYPE_SHIFT | (dst_is_indirect ? 0x0 : 0x1) << DST_ACCESS_TYPE_SHIFT | - ((num_ccs_blks - 1) & CCS_SIZE_MASK) << CCS_SIZE_SHIFT; + ccs_copy_size; *cs++ = lower_32_bits(src_ofs); *cs++ = upper_32_bits(src_ofs) | mocs; *cs++ = lower_32_bits(dst_ofs); -- 2.25.1