Subject: Re: [PATCH V5 1/4] drm/xe/xe3p_lpg: flush shrinker bo cachelines manually
From: Thomas Hellström
To: Tejas Upadhyay, intel-xe@lists.freedesktop.org
Cc: matthew.auld@intel.com, carl.zhang@intel.com, jose.souza@intel.com
Date: Thu, 05 Mar 2026 11:41:54 +0100
In-Reply-To: <20260303062441.1860959-7-tejas.upadhyay@intel.com>
References: <20260303062441.1860959-6-tejas.upadhyay@intel.com> <20260303062441.1860959-7-tejas.upadhyay@intel.com>
Organization: Intel Sweden AB, Registration Number: 556189-6027

Hi, Tejas

On Tue, 2026-03-03 at 11:54 +0530, Tejas Upadhyay wrote:
> XA, new pat_index introduced post xe3p_lpg, is memory shared between the
> CPU and GPU is treated differently from other GPU memory when the Media
> engine is power-gated.
>
> XA is *always* flushed, like at the end-of-submssion (and maybe other
> places), just that internally as an optimisation hw doesn't need to make
> that a full flush (which will also include XA) when Media is
> off/powergated, since it doesn't need to worry about GT caches vs Media
> coherency, and only CPU vs GPU coherency, so can make that flush a
> targeted XA flush, since stuff tagged with XA now means it's shared with
> the CPU.
> The main implication is that we now need to somehow flush non-XA
> before freeing system memory pages, otherwise dirty cachelines could be
> flushed after the free (like if Media suddenly turns on and does a full
> flush)
>
> V3(Thomas/MattA/MattR): Restrict userptr with non-xa, then no need to flush manually
> V2(MattA): Expand commit description
>
> Signed-off-by: Tejas Upadhyay
> ---
>  drivers/gpu/drm/xe/xe_bo.c     |  3 ++-
>  drivers/gpu/drm/xe/xe_device.c | 23 +++++++++++++++++++++++
>  drivers/gpu/drm/xe/xe_device.h |  1 +
>  3 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index d6c2cb959cdd..d2ee9701eae6 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -689,7 +689,8 @@ static int xe_bo_trigger_rebind(struct xe_device *xe, struct xe_bo *bo,
>
>  		if (!xe_vm_in_fault_mode(vm)) {
>  			drm_gpuvm_bo_evict(vm_bo, true);
> -			continue;
> +			if (!xe_device_is_l2_flush_optimized(xe))
> +				continue;

Could you please add a code comment here, something along the lines of

"L2 cache may not be flushed, so ensure that is done in
xe_vm_invalidate_vma() below"

With that,
Reviewed-by: Thomas Hellström

I also think a possible follow-up here would be to check the PAT
indices of the vmas we loop over and only invalidate if they indicate
non-XA? If that makes sense, please consider as a follow-up patch.

Thanks,
Thomas

>  		}
>
>  		if (!idle) {
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 4b68a2d55651..94c9f17da4b4 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -1097,6 +1097,29 @@ static void tdf_request_sync(struct xe_device *xe)
>  	}
>  }
>
> +/**
> + * xe_device_is_l2_flush_optimized - if L2 flush is optimized by HW
> + * @xe: The device to check.
> + *
> + * Return: true if the HW device optimizing L2 flush, false otherwise.
> + */
> +bool xe_device_is_l2_flush_optimized(struct xe_device *xe)
> +{
> +	/* XA is *always* flushed, like at the end-of-submssion (and maybe other
> +	 * places), just that internally as an optimisation hw doesn't need to make
> +	 * that a full flush (which will also include XA) when Media is
> +	 * off/powergated, since it doesn't need to worry about GT caches vs Media
> +	 * coherency, and only CPU vs GPU coherency, so can make that flush a
> +	 * targeted XA flush, since stuff tagged with XA now means it's shared with
> +	 * the CPU. The main implication is that we now need to somehow flush non-XA before
> +	 * freeing system memory pages, otherwise dirty cachelines could be flushed after the free
> +	 * (like if Media suddenly turns on and does a full flush)
> +	 */
> +	if (GRAPHICS_VER(xe) >= 35 && !IS_DGFX(xe))
> +		return true;
> +	return false;
> +}
> +
>  void xe_device_l2_flush(struct xe_device *xe)
>  {
>  	struct xe_gt *gt;
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index 39464650533b..dfbf96e12d2e 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -184,6 +184,7 @@ void xe_device_snapshot_print(struct xe_device *xe, struct drm_printer *p);
>  u64 xe_device_canonicalize_addr(struct xe_device *xe, u64 address);
>  u64 xe_device_uncanonicalize_addr(struct xe_device *xe, u64 address);
>
> +bool xe_device_is_l2_flush_optimized(struct xe_device *xe);
>  void xe_device_td_flush(struct xe_device *xe);
>  void xe_device_l2_flush(struct xe_device *xe);
>
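
For illustration only, here is a minimal standalone sketch of the check in
the quoted patch together with the possible PAT-index follow-up mentioned
in the review. It is not driver code: struct mock_device, struct mock_vma,
MOCK_PAT_XA and both mock_* functions are hypothetical stand-ins, and the
XA PAT index value is made up. Only the condition (graphics version >= 35
and not discrete) is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver types; not the real xe structs. */
struct mock_device {
	int graphics_ver;	/* models GRAPHICS_VER(xe) */
	bool is_dgfx;		/* models IS_DGFX(xe) */
};

struct mock_vma {
	int pat_index;
	bool invalidated;
};

/* Assumption: a made-up PAT index standing in for "XA". */
#define MOCK_PAT_XA 3

/* Mirrors the condition used in the quoted patch. */
static bool mock_is_l2_flush_optimized(const struct mock_device *dev)
{
	return dev->graphics_ver >= 35 && !dev->is_dgfx;
}

/*
 * Models the non-fault-mode branch of the rebind loop. Without the L2
 * flush optimization nothing is invalidated (the old "continue"). With
 * it, only vmas whose PAT index is not XA are invalidated, which is the
 * follow-up idea from the review. Returns the number of vmas invalidated.
 */
static size_t mock_invalidate_non_xa(const struct mock_device *dev,
				     struct mock_vma *vmas, size_t count)
{
	size_t i, n = 0;

	assert(vmas != NULL || count == 0);

	if (!mock_is_l2_flush_optimized(dev))
		return 0;

	for (i = 0; i < count; i++) {
		if (vmas[i].pat_index == MOCK_PAT_XA)
			continue;	/* XA is always flushed by hw */
		vmas[i].invalidated = true;
		n++;
	}
	return n;
}
```

Whether skipping XA-tagged vmas is actually safe in the driver depends on
details not covered in this thread, so treat the loop above as a way to
read the suggestion rather than as a proposed implementation.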