On Wed, Dec 06, 2023 at 10:01:19AM +0530, Himal Prasad Ghimiray wrote:

Each byte of CCS data now represents 512 bytes of main memory data.
Allocate extra pages to handle ccs region for igfx too.

This description seems confusing. Explicitly allocating memory for CCS data sounds more like the legacy AuxCCS rather than FlatCCS. For FlatCCS, the storage for the CCS data is already pre-allocated at a well-defined location (I'm assuming it's in some kind of stolen memory on an igpu?)
On an igpu, the flat CCS region is firmware-reserved memory. The driver needs
to allocate extra system memory to hold the CCS metadata while evicting.
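
To put rough numbers on the extra allocation, here is a minimal standalone sketch of the sizing arithmetic. It is not driver code: the sketch_* names and the 4 KiB page size are assumptions made for illustration, and DIV_ROUND_UP is redefined locally. It only mirrors the 256 vs. 512 ratio from the commit message and the diff.

```c
#include <stdint.h>

/* Local stand-in for the kernel's DIV_ROUND_UP(); redefined so this compiles alone. */
#define SKETCH_DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Assumption for illustration only: 4 KiB pages. */
#define SKETCH_PAGE_SIZE		4096ull

/*
 * Bytes of CCS metadata needed for 'size' bytes of main surface data.
 * One CCS byte covers 512 bytes on graphics version >= 20, else 256,
 * matching the ratio used in the patch.
 */
static uint64_t sketch_ccs_bytes(uint64_t size, unsigned int graphics_ver)
{
	uint64_t bytes_per_ccs_byte = graphics_ver >= 20 ? 512 : 256;

	return SKETCH_DIV_ROUND_UP(size, bytes_per_ccs_byte);
}

/* Whole pages needed to hold that metadata (hypothetical helper). */
static uint64_t sketch_ccs_pages(uint64_t size, unsigned int graphics_ver)
{
	return SKETCH_DIV_ROUND_UP(sketch_ccs_bytes(size, graphics_ver),
				   SKETCH_PAGE_SIZE);
}
```

Under these assumptions a 2 MiB bo needs 8192 bytes (two pages) of metadata at the 256:1 ratio and 4096 bytes (one page) at 512:1.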
On discrete GPUs, if a surface was being migrated from LMEM to SMEM, then you'd probably need extra storage for the SMEM copy. But that doesn't seem like it would be relevant to an igpu since there's no lmem<->smem migration happening.
In the case of igfx, we need to store the flat CCS metadata when a bo is moved
in/out of the gpu domain. The CCS metadata needs to be copied from the flat CCS
region to the extra pages allocated in the bo when the bo moves from the gpu
domain to the system domain, and vice versa.
As a general comment, it might be worth starting this series with a patch that describes and documents how FlatCCS actually works on an igpu. Since the main surfaces are in smem rather than a separate lmem area, how does the CCS work? Where does the CCS data live and how are addresses of main surfaces (in smem) translated to CCS offsets? Is there CCS space set aside for the entire SMEM physical address space, even though a lot of that memory is going to be used for non-graphics purposes?
Bspec:58796

v2:
- For dgfx ensure system bit is not set.
- Modify comments. (Thomas)

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
---
 drivers/gpu/drm/xe/regs/xe_gpu_commands.h |  2 +-
 drivers/gpu/drm/xe/xe_bo.c                | 14 +++++++++-----
 drivers/gpu/drm/xe/xe_device.c            |  2 +-
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/xe/regs/xe_gpu_commands.h b/drivers/gpu/drm/xe/regs/xe_gpu_commands.h
index f1c5bf203b3d..1f9c32e694c6 100644
--- a/drivers/gpu/drm/xe/regs/xe_gpu_commands.h
+++ b/drivers/gpu/drm/xe/regs/xe_gpu_commands.h
@@ -16,7 +16,7 @@
 #define XY_CTRL_SURF_MOCS_MASK			GENMASK(31, 26)
 #define XE2_XY_CTRL_SURF_MOCS_INDEX_MASK	GENMASK(31, 28)
 #define NUM_CCS_BYTES_PER_BLOCK			256
-#define NUM_BYTES_PER_CCS_BYTE			256
+#define NUM_BYTES_PER_CCS_BYTE(_xe)		(GRAPHICS_VER(_xe) >= 20 ? 512 : 256)

Changes like this that change platform-specific Xe1 vs Xe2 details should probably be kept in a separate patch from the more general "support FlatCCS on an igpu" work happening here.
Will separate out this change into another patch.
BR
Himal
Matt

 #define NUM_CCS_BLKS_PER_XFER			1024

 #define XY_FAST_COLOR_BLT_CMD			(2 << 29 | 0x44 << 22)
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 72dc4a4eed4e..81630838d769 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -2173,8 +2173,8 @@ int xe_bo_evict(struct xe_bo *bo, bool force_alloc)
  * placed in system memory.
  * @bo: The xe_bo
  *
- * If a bo has an allowable placement in XE_PL_TT memory, it can't use
- * flat CCS compression, because the GPU then has no way to access the
+ * For dgfx if a bo has an allowable placement in XE_PL_TT memory, it can't
+ * use flat CCS compression, because the GPU then has no way to access the
  * CCS metadata using relevant commands. For the opposite case, we need to
  * allocate storage for the CCS metadata when the BO is not resident in
  * VRAM memory.
@@ -2183,9 +2183,13 @@ int xe_bo_evict(struct xe_bo *bo, bool force_alloc)
  */
 bool xe_bo_needs_ccs_pages(struct xe_bo *bo)
 {
-	return bo->ttm.type == ttm_bo_type_device &&
-		!(bo->flags & XE_BO_CREATE_SYSTEM_BIT) &&
-		(bo->flags & XE_BO_CREATE_VRAM_MASK);
+	struct xe_device *xe = xe_bo_device(bo);
+
+	return (xe_device_has_flat_ccs(xe) &&
+		bo->ttm.type == ttm_bo_type_device &&
+		((IS_DGFX(xe) && (bo->flags & XE_BO_CREATE_VRAM_MASK) &&
+		!(bo->flags & XE_BO_CREATE_SYSTEM_BIT)) ||
+		(!IS_DGFX(xe) && (bo->flags & XE_BO_CREATE_SYSTEM_BIT))));
 }

 /**
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 400fa1ac6168..50c87f03c51c 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -605,7 +605,7 @@ void xe_device_wmb(struct xe_device *xe)
 u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
 {
 	return xe_device_has_flat_ccs(xe) ?
-		DIV_ROUND_UP(size, NUM_BYTES_PER_CCS_BYTE) : 0;
+		DIV_ROUND_UP(size, NUM_BYTES_PER_CCS_BYTE(xe)) : 0;
 }

 bool xe_device_mem_access_ongoing(struct xe_device *xe)
--
2.25.1
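
The nested condition in the new xe_bo_needs_ccs_pages() is easier to follow when split into early returns. Below is a minimal standalone sketch of the same decision table, not a proposed replacement: the sketch_* structs and the flag bit values are hypothetical stand-ins for the driver's types and XE_BO_CREATE_* definitions, chosen only so the logic compiles and can be checked in isolation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration; not the driver's definitions. */
#define SKETCH_BO_CREATE_SYSTEM_BIT	(1u << 0)
#define SKETCH_BO_CREATE_VRAM_MASK	(1u << 1)

/* Stand-in for the device properties the predicate consults. */
struct sketch_dev {
	bool has_flat_ccs;	/* models xe_device_has_flat_ccs(xe) */
	bool is_dgfx;		/* models IS_DGFX(xe) */
};

/* Stand-in for the bo fields the predicate consults. */
struct sketch_bo {
	bool is_device_type;	/* models bo->ttm.type == ttm_bo_type_device */
	uint32_t flags;
};

static bool sketch_bo_needs_ccs_pages(const struct sketch_dev *dev,
				      const struct sketch_bo *bo)
{
	/* No flat CCS, or not a device-type bo: never needs CCS pages. */
	if (!dev->has_flat_ccs || !bo->is_device_type)
		return false;

	/* dgfx: VRAM placement allowed and the system bit not set. */
	if (dev->is_dgfx)
		return (bo->flags & SKETCH_BO_CREATE_VRAM_MASK) != 0 &&
		       (bo->flags & SKETCH_BO_CREATE_SYSTEM_BIT) == 0;

	/* igfx: the system bit is set. */
	return (bo->flags & SKETCH_BO_CREATE_SYSTEM_BIT) != 0;
}
```

With these stand-ins, a dgfx bo with both the VRAM and system bits set does not get CCS pages, while an igfx bo with the system bit set does, which matches the v2 note "For dgfx ensure system bit is not set."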