From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 26321CEBF61 for ; Mon, 17 Nov 2025 14:32:32 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DB1DC10E1C5; Mon, 17 Nov 2025 14:32:31 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="oFwaUlw7"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.8]) by gabe.freedesktop.org (Postfix) with ESMTPS id 09CB410E1C5 for ; Mon, 17 Nov 2025 14:32:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1763389950; x=1794925950; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=KfxQFXEK4yzMMn4TSyhivSCtIHVi/MsMyiQwPI/CvmA=; b=oFwaUlw7qeCDWaGAoSpdSjQM6zeV3+Xtq631k1IBdahfjrSkx3QyRnlj sbuMdsQtzolkHkh/l4wq4mCbujm/IN3pkTnpT8ztptlv8z4kUMOkhqatY Y7q7pyv5MYlDOYZZQkoh91y0SDKTIr0VhMxHVSygXIkpFkvYt4uoW1HzF u+nIOCJXH+cXLExOoIZteEaHpU5Vxf0RJnPrmowdEzgIYdUSgIAGHmLqO fMyJTi+3Xkme/EmDLoLEiO750Ut2uPoXPpIBAFlwS1sSF9qFdS5XcyAHl KIf/T/anGDxwk4q4etR4imTuQJfvhcRxkuJjOisGb6F8fxmGmHoTcBMJf g==; X-CSE-ConnectionGUID: lJCzYdlVR5WbcBB3kBu8Qw== X-CSE-MsgGUID: +lWBKyyqSk2hNNp7qGWCeQ== X-IronPort-AV: E=McAfee;i="6800,10657,11616"; a="83015468" X-IronPort-AV: E=Sophos;i="6.19,312,1754982000"; d="scan'208";a="83015468" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by fmvoesa102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Nov 2025 06:32:29 -0800 X-CSE-ConnectionGUID: pxi9OKVUSQWjO1e6P156Tw== X-CSE-MsgGUID: Va9zLRlITrqrKjqbVszm2Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.19,312,1754982000"; d="scan'208";a="194785141" Received: from egrumbac-mobl6.ger.corp.intel.com (HELO [10.245.244.254]) ([10.245.244.254]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Nov 2025 06:32:28 -0800 Message-ID: <41d8f41c-37e0-4cd7-b1f5-2e9f474a1e58@intel.com> Date: Mon, 17 Nov 2025 14:32:26 +0000 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v4] drm/xe/uapi: Add NO_COMPRESSION BO flag and query capability To: Sanjay Yadav , intel-xe@lists.freedesktop.org Cc: =?UTF-8?Q?Jos=C3=A9_Roberto_de_Souza?= References: <20251117135327.2866171-2-sanjay.kumar.yadav@intel.com> Content-Language: en-GB From: Matthew Auld In-Reply-To: <20251117135327.2866171-2-sanjay.kumar.yadav@intel.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-BeenThere: intel-xe@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Xe graphics driver List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-xe-bounces@lists.freedesktop.org Sender: "Intel-xe" On 17/11/2025 13:53, Sanjay Yadav wrote: > Introduce DRM_XE_GEM_CREATE_FLAG_NO_COMPRESSION to let userspace > opt out of CCS compression on a per-BO basis. When set, the driver > maps this to XE_BO_FLAG_NO_COMPRESSION, skips CCS metadata > allocation/clearing, and rejects compressed PAT indices at vm_bind. > This avoids extra memory ops and manual CCS state handling for buffers. > > To allow userspace to detect at runtime whether the kernel supports this > feature, add DRM_XE_QUERY_CONFIG_FLAG_HAS_NO_COMPRESSION_HINT and expose > it via query_config() on Xe2+ platforms. > > v2 > - Changed error code from -EINVAL to -EOPNOTSUPP for unsupported flag > usage on pre-Xe2 platforms > - Fixed checkpatch warning in xe_vm.c > - Fixed kernel-doc formatting in xe_drm.h > > v3 > - Rebase > - Updated commit title and description > - Added UAPI for DRM_XE_QUERY_CONFIG_FLAG_HAS_NO_COMPRESSION_HINT and > exposed it via query_config() > > v4 > - Rebase > > Suggested-by: Matthew Auld > Suggested-by: José Roberto de Souza > Signed-off-by: Sanjay Yadav > --- > drivers/gpu/drm/xe/xe_bo.c | 15 +++++++++++++-- > drivers/gpu/drm/xe/xe_bo.h | 1 + > drivers/gpu/drm/xe/xe_pat.c | 9 ++++++++- > drivers/gpu/drm/xe/xe_pat.h | 15 +++++++++++++++ > drivers/gpu/drm/xe/xe_query.c | 3 +++ > drivers/gpu/drm/xe/xe_vm.c | 5 +++++ > include/uapi/drm/xe_drm.h | 16 ++++++++++++++++ > 7 files changed, 61 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c > index b0bd31d14bb9..c969c2bc4249 100644 > --- a/drivers/gpu/drm/xe/xe_bo.c > +++ b/drivers/gpu/drm/xe/xe_bo.c > @@ -3183,7 +3183,8 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data, > if (XE_IOCTL_DBG(xe, args->flags & > ~(DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING | > DRM_XE_GEM_CREATE_FLAG_SCANOUT | > - DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM))) > + DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM | > + DRM_XE_GEM_CREATE_FLAG_NO_COMPRESSION))) > return -EINVAL; > > if (XE_IOCTL_DBG(xe, args->handle)) > @@ -3205,6 +3206,12 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data, > if (args->flags & DRM_XE_GEM_CREATE_FLAG_SCANOUT) > bo_flags |= XE_BO_FLAG_SCANOUT; > > + if (args->flags & DRM_XE_GEM_CREATE_FLAG_NO_COMPRESSION) { > + if (GRAPHICS_VER(xe) < 20) > + return -EOPNOTSUPP; > + bo_flags |= XE_BO_FLAG_NO_COMPRESSION; > + } > + > bo_flags |= args->placement << (ffs(XE_BO_FLAG_SYSTEM) - 1); > > /* CCS formats need physical placement at a 64K alignment in VRAM. */ > @@ -3526,8 +3533,12 @@ bool xe_bo_needs_ccs_pages(struct xe_bo *bo) > * Compression implies coh_none, therefore we know for sure that WB > * memory can't currently use compression, which is likely one of the > * common cases. > + * Additionally, userspace may explicitly request no compression via the > + * DRM_XE_GEM_CREATE_FLAG_NO_COMPRESSION flag, which should also disable > + * CCS usage. > */ > - if (bo->cpu_caching == DRM_XE_GEM_CPU_CACHING_WB) > + if (bo->cpu_caching == DRM_XE_GEM_CPU_CACHING_WB || > + bo->flags & XE_BO_FLAG_NO_COMPRESSION) > return false; > > return true; > diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h > index 911d5b90461a..8ab4474129c3 100644 > --- a/drivers/gpu/drm/xe/xe_bo.h > +++ b/drivers/gpu/drm/xe/xe_bo.h > @@ -50,6 +50,7 @@ > #define XE_BO_FLAG_GGTT3 BIT(23) > #define XE_BO_FLAG_CPU_ADDR_MIRROR BIT(24) > #define XE_BO_FLAG_FORCE_USER_VRAM BIT(25) > +#define XE_BO_FLAG_NO_COMPRESSION BIT(26) > > /* this one is trigger internally only */ > #define XE_BO_FLAG_INTERNAL_TEST BIT(30) > diff --git a/drivers/gpu/drm/xe/xe_pat.c b/drivers/gpu/drm/xe/xe_pat.c > index 1b4d5d3def0f..1363a3e4e47e 100644 > --- a/drivers/gpu/drm/xe/xe_pat.c > +++ b/drivers/gpu/drm/xe/xe_pat.c > @@ -116,7 +116,8 @@ static const struct xe_pat_table_entry xelpg_pat_table[] = { > REG_FIELD_PREP(XE2_COH_MODE, __coh_mode), \ > .coh_mode = (BUILD_BUG_ON_ZERO(__coh_mode && comp_en) || __coh_mode) ? \ > XE_COH_AT_LEAST_1WAY : XE_COH_NONE, \ > - .valid = 1 \ > + .valid = 1, \ > + .compressed = comp_en, \ > } > > static const struct xe_pat_table_entry xe2_pat_table[] = { > @@ -202,6 +203,12 @@ bool xe_pat_index_get_comp_en(struct xe_device *xe, u16 pat_index) > return !!(xe->pat.table[pat_index].value & XE2_COMP_EN); > } > > +bool xe_pat_index_has_compression(struct xe_device *xe, u16 pat_index) > +{ > + xe_assert(xe, pat_index < xe->pat.n_entries); > + return xe->pat.table[pat_index].compressed; > +} Please use xe_pat_index_get_comp_en() instead. Otherwise lgtm. > + > static void program_pat(struct xe_gt *gt, const struct xe_pat_table_entry table[], > int n_entries) > { > diff --git a/drivers/gpu/drm/xe/xe_pat.h b/drivers/gpu/drm/xe/xe_pat.h > index b8559120989e..0bbe803b8ce5 100644 > --- a/drivers/gpu/drm/xe/xe_pat.h > +++ b/drivers/gpu/drm/xe/xe_pat.h > @@ -34,6 +34,11 @@ struct xe_pat_table_entry { > * @valid: Set to 1 if the entry is valid, 0 if it's reserved. > */ > u16 valid; > + > + /** > + * @compressed: Whether compression is enabled or not with @value. > + */ > + bool compressed; > }; > > /** > @@ -68,4 +73,14 @@ u16 xe_pat_index_get_coh_mode(struct xe_device *xe, u16 pat_index); > */ > bool xe_pat_index_get_comp_en(struct xe_device *xe, u16 pat_index); > > +/** > + * xe_pat_index_has_compression - Check if the given pat_index enables > + * compression. > + * @xe: xe device > + * @pat_index: The pat_index to query > + * > + * Note: Only applicable to Xe2+, where compression is part of the PAT index. > + */ > +bool xe_pat_index_has_compression(struct xe_device *xe, u16 pat_index); > + > #endif > diff --git a/drivers/gpu/drm/xe/xe_query.c b/drivers/gpu/drm/xe/xe_query.c > index 1c0915e2cc16..b392c9b3f0c9 100644 > --- a/drivers/gpu/drm/xe/xe_query.c > +++ b/drivers/gpu/drm/xe/xe_query.c > @@ -342,6 +342,9 @@ static int query_config(struct xe_device *xe, struct drm_xe_device_query *query) > if (xe->info.has_usm && IS_ENABLED(CONFIG_DRM_XE_GPUSVM)) > config->info[DRM_XE_QUERY_CONFIG_FLAGS] |= > DRM_XE_QUERY_CONFIG_FLAG_HAS_CPU_ADDR_MIRROR; > + if (GRAPHICS_VER(xe) >= 20) > + config->info[DRM_XE_QUERY_CONFIG_FLAGS] |= > + DRM_XE_QUERY_CONFIG_FLAG_HAS_NO_COMPRESSION_HINT; > config->info[DRM_XE_QUERY_CONFIG_FLAGS] |= > DRM_XE_QUERY_CONFIG_FLAG_HAS_LOW_LATENCY; > config->info[DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT] = > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c > index 7cac646bdf1c..e377c872ec0f 100644 > --- a/drivers/gpu/drm/xe/xe_vm.c > +++ b/drivers/gpu/drm/xe/xe_vm.c > @@ -3484,6 +3484,11 @@ static int xe_vm_bind_ioctl_validate_bo(struct xe_device *xe, struct xe_bo *bo, > { > u16 coh_mode; > > + /* Reject compressed PAT index for BO with NO_COMPRESSION flag */ > + if ((bo->flags & XE_BO_FLAG_NO_COMPRESSION) && > + xe_pat_index_has_compression(xe, pat_index)) > + return -EINVAL; > + > if (XE_IOCTL_DBG(xe, range > xe_bo_size(bo)) || > XE_IOCTL_DBG(xe, obj_offset > > xe_bo_size(bo) - range)) { > diff --git a/include/uapi/drm/xe_drm.h b/include/uapi/drm/xe_drm.h > index 47853659a705..b9840cb77360 100644 > --- a/include/uapi/drm/xe_drm.h > +++ b/include/uapi/drm/xe_drm.h > @@ -403,6 +403,9 @@ struct drm_xe_query_mem_regions { > * has low latency hint support > * - %DRM_XE_QUERY_CONFIG_FLAG_HAS_CPU_ADDR_MIRROR - Flag is set if the > * device has CPU address mirroring support > + * - %DRM_XE_QUERY_CONFIG_FLAG_HAS_NO_COMPRESSION_HINT - Flag is set if the > + * device supports the userspace hint %DRM_XE_GEM_CREATE_FLAG_NO_COMPRESSION. > + * This is exposed only on Xe2+. > * - %DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT - Minimal memory alignment > * required by this device, typically SZ_4K or SZ_64K > * - %DRM_XE_QUERY_CONFIG_VA_BITS - Maximum bits of a virtual address > @@ -421,6 +424,7 @@ struct drm_xe_query_config { > #define DRM_XE_QUERY_CONFIG_FLAG_HAS_VRAM (1 << 0) > #define DRM_XE_QUERY_CONFIG_FLAG_HAS_LOW_LATENCY (1 << 1) > #define DRM_XE_QUERY_CONFIG_FLAG_HAS_CPU_ADDR_MIRROR (1 << 2) > + #define DRM_XE_QUERY_CONFIG_FLAG_HAS_NO_COMPRESSION_HINT (1 << 3) > #define DRM_XE_QUERY_CONFIG_MIN_ALIGNMENT 2 > #define DRM_XE_QUERY_CONFIG_VA_BITS 3 > #define DRM_XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITY 4 > @@ -791,6 +795,17 @@ struct drm_xe_device_query { > * need to use VRAM for display surfaces, therefore the kernel requires > * setting this flag for such objects, otherwise an error is thrown on > * small-bar systems. > + * - %DRM_XE_GEM_CREATE_FLAG_NO_COMPRESSION - Allows userspace to > + * hint that compression (CCS) should be disabled for the buffer being > + * created. This can avoid unnecessary memory operations and CCS state > + * management. > + * On pre-Xe2 platforms, this flag is currently rejected as compression > + * control is not supported via PAT index. On Xe2+ platforms, compression > + * is controlled via PAT entries. If this flag is set, the driver will reject > + * any VM bind that requests a PAT index enabling compression for this BO. > + * Note: On dGPU platforms, there is currently no change in behavior with > + * this flag, but future improvements may leverage it. The current benefit is > + * primarily applicable to iGPU platforms. > * > * @cpu_caching supports the following values: > * - %DRM_XE_GEM_CPU_CACHING_WB - Allocate the pages with write-back > @@ -837,6 +852,7 @@ struct drm_xe_gem_create { > #define DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING (1 << 0) > #define DRM_XE_GEM_CREATE_FLAG_SCANOUT (1 << 1) > #define DRM_XE_GEM_CREATE_FLAG_NEEDS_VISIBLE_VRAM (1 << 2) > +#define DRM_XE_GEM_CREATE_FLAG_NO_COMPRESSION (1 << 3) > /** > * @flags: Flags, currently a mask of memory instances of where BO can > * be placed