On 07-12-2023 04:52, Matt Roper wrote: > On Wed, Dec 06, 2023 at 10:01:20AM +0530, Himal Prasad Ghimiray wrote: >> - The XY_CTRL_SURF_COPY_BLT instruction operationg on ccs data expects >> size in pages of main memory for which CCS data should be copied. >> - The bitfield representing copy size in XY_CTRL_SURF_COPY_BLT has >> shifted one bit higher in the instruction. >> >> v2: >> - Fix the num_pages for ccs size calculation. >> - Address nits (Thomas) >> >> Reviewed-by: Thomas Hellström >> Cc: Thomas Hellström >> Signed-off-by: Himal Prasad Ghimiray >> --- >> drivers/gpu/drm/xe/regs/xe_gpu_commands.h | 1 + >> drivers/gpu/drm/xe/xe_migrate.c | 21 +++++++++++++++------ >> 2 files changed, 16 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/gpu/drm/xe/regs/xe_gpu_commands.h b/drivers/gpu/drm/xe/regs/xe_gpu_commands.h >> index 1f9c32e694c6..6ff6c7aaa648 100644 >> --- a/drivers/gpu/drm/xe/regs/xe_gpu_commands.h >> +++ b/drivers/gpu/drm/xe/regs/xe_gpu_commands.h >> @@ -13,6 +13,7 @@ >> #define DST_ACCESS_TYPE_SHIFT 20 >> #define CCS_SIZE_MASK 0x3FF >> #define CCS_SIZE_SHIFT 8 >> +#define XE2_CCS_SIZE_SHIFT 9 > It would be preferred if we just removed the shifts and used > FIELD_PREP() to put values into proper field masks instead. > >> #define XY_CTRL_SURF_MOCS_MASK GENMASK(31, 26) >> #define XE2_XY_CTRL_SURF_MOCS_INDEX_MASK GENMASK(31, 28) >> #define NUM_CCS_BYTES_PER_BLOCK 256 >> diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c >> index e8b567708ac0..7ef068451b59 100644 >> --- a/drivers/gpu/drm/xe/xe_migrate.c >> +++ b/drivers/gpu/drm/xe/xe_migrate.c >> @@ -526,21 +526,30 @@ static void emit_copy_ccs(struct xe_gt *gt, struct xe_bb *bb, >> struct xe_device *xe = gt_to_xe(gt); >> u32 *cs = bb->cs + bb->len; >> u32 num_ccs_blks; >> + u32 num_pages; >> + u32 ccs_copy_size; >> u32 mocs; >> >> - num_ccs_blks = DIV_ROUND_UP(xe_device_ccs_bytes(gt_to_xe(gt), size), >> - NUM_CCS_BYTES_PER_BLOCK); >> - xe_gt_assert(gt, num_ccs_blks <= NUM_CCS_BLKS_PER_XFER); >> + if (GRAPHICS_VERx100(xe) >= 2000) { >> + num_pages = DIV_ROUND_UP(size, XE_PAGE_SIZE); >> + xe_gt_assert(gt, num_pages <= 1024); > Rather than using a magic number here, FIELD_FIT() might be better. We > can do the same for the other branch as well and get rid of > NUM_CCS_BLKS_PER_XFER too. Sure. Using FIELD_FIT is much better. Will update in next version. BR Himal > > Matt > >> >> - if (GRAPHICS_VERx100(xe) >= 2000) >> + ccs_copy_size = ((num_pages - 1) & CCS_SIZE_MASK) << XE2_CCS_SIZE_SHIFT; >> mocs = FIELD_PREP(XE2_XY_CTRL_SURF_MOCS_INDEX_MASK, gt->mocs.uc_index); >> - else >> + >> + } else { >> + num_ccs_blks = DIV_ROUND_UP(xe_device_ccs_bytes(gt_to_xe(gt), size), >> + NUM_CCS_BYTES_PER_BLOCK); >> + xe_gt_assert(gt, num_ccs_blks <= NUM_CCS_BLKS_PER_XFER); >> + >> + ccs_copy_size = ((num_ccs_blks - 1) & CCS_SIZE_MASK) << CCS_SIZE_SHIFT; >> mocs = FIELD_PREP(XY_CTRL_SURF_MOCS_MASK, gt->mocs.uc_index); >> + } >> >> *cs++ = XY_CTRL_SURF_COPY_BLT | >> (src_is_indirect ? 0x0 : 0x1) << SRC_ACCESS_TYPE_SHIFT | >> (dst_is_indirect ? 0x0 : 0x1) << DST_ACCESS_TYPE_SHIFT | >> - ((num_ccs_blks - 1) & CCS_SIZE_MASK) << CCS_SIZE_SHIFT; >> + ccs_copy_size; >> *cs++ = lower_32_bits(src_ofs); >> *cs++ = upper_32_bits(src_ofs) | mocs; >> *cs++ = lower_32_bits(dst_ofs); >> -- >> 2.25.1 >>