From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0A78EE7718B for ; Tue, 24 Dec 2024 03:11:38 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B7B1510E20D; Tue, 24 Dec 2024 03:11:37 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="ZMRtXkIw"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.21]) by gabe.freedesktop.org (Postfix) with ESMTPS id C2A5E10E20D for ; Tue, 24 Dec 2024 03:11:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1735009897; x=1766545897; h=date:message-id:from:to:cc:subject:in-reply-to: references:mime-version:content-transfer-encoding; bh=AeoUOVLj7KmdasO53QNOV26eg7ERBgNPHCJubvNaxdw=; b=ZMRtXkIw1XRcf7zC6+cAmpz9koXpHmUylic5sJ2Ywjo1Xe9lRXqtTg9D /l7H6J9DSG2/bQyM7NwKCRsWnhHYeNbHv6bVP4N31DqTGcBbuZPUBDrej DVV7DB0pvpb3LanBQc7aDB9TUYkbtDIv1bK05B7EY1vnOg/vTrCIsYDX4 HSnWmwsP4zYFwB1m5PAefPVX+dMeXo9SRI7KENN251i9TlA1h2XJgkHpQ /Lr8n7EObbDr+RLeLwYDnnjX73UYdCtFuvSIv074ZTfSQyq52Si5j/dU7 NDF6B3dqxlRTkqoudl4BTBVkFrnlwQhCpaTgg0Og54uj4OzPlNGkzO38t Q==; X-CSE-ConnectionGUID: pXjnUVa7Q76vCkkaNWNBVA== X-CSE-MsgGUID: w4FZD5CGRcKLGZRl7AXOcg== X-IronPort-AV: E=McAfee;i="6700,10204,11295"; a="35366896" X-IronPort-AV: E=Sophos;i="6.12,259,1728975600"; d="scan'208";a="35366896" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa113.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Dec 2024 19:11:33 -0800 X-CSE-ConnectionGUID: Cbcl14HJQ+iIXa8FtZOS/A== X-CSE-MsgGUID: rPTfkQEETNu8BojT9PoSCw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,259,1728975600"; d="scan'208";a="104226446" Received: from orsosgc001.jf.intel.com (HELO orsosgc001.intel.com) ([10.165.21.142]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Dec 2024 19:08:09 -0800 Date: Mon, 23 Dec 2024 19:08:09 -0800 Message-ID: <85cyhh22xy.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" To: =?ISO-8859-1?Q?Jos=E9?= Roberto de Souza Cc: , Thomas =?ISO-8859-1?Q?Hellstr=F6m?= , Jonathan Cavitt Subject: Re: [PATCH] drm/xe: Remove 'Force write completion' from MI_STORE_DATA_IMM In-Reply-To: <20241223182918.126288-1-jose.souza@intel.com> References: <20241223182918.126288-1-jose.souza@intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/28.2 (x86_64-redhat-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-BeenThere: intel-xe@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Xe graphics driver List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-xe-bounces@lists.freedesktop.org Sender: "Intel-xe" On Mon, 23 Dec 2024 10:29:18 -0800, Jos=E9 Roberto de Souza wrote: > > In all places the MI_STORE_DATA_IMM are not followed by a read of > the same memory address in the same batch buffer and the posted writes > are flushed with PIPE_CONTROL or MI_FLUSH_DW in xe_ring_ops.c functions > so there is no need to set this register. > > This is not a revert because I think it is worthy keep the register > definition and the comment on top of it so future readers don't > get the same wrong conclusion as I did. > > Cc: Thomas Hellstr=F6m > Cc: Ashutosh Dixit > Fixes: 1460bb1fef9c ("drm/xe: Force write completion of MI_STORE_DATA_IMM= ") > Signed-off-by: Jos=E9 Roberto de Souza > --- > drivers/gpu/drm/xe/instructions/xe_mi_commands.h | 4 ++++ > drivers/gpu/drm/xe/xe_migrate.c | 9 ++------- > drivers/gpu/drm/xe/xe_oa.c | 1 - > drivers/gpu/drm/xe/xe_ring_ops.c | 6 ++---- > 4 files changed, 8 insertions(+), 12 deletions(-) > > diff --git a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h b/drivers/g= pu/drm/xe/instructions/xe_mi_commands.h > index f4ee910f09432..8e9884aaa4b8a 100644 > --- a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h > +++ b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h > @@ -35,6 +35,10 @@ > > #define MI_STORE_DATA_IMM __MI_INSTR(0x20) > #define MI_SDI_GGTT REG_BIT(22) > +/* > + * PIPE_CONTROL and MI_FLUSH_DW flushes all posted writes, only needed if > + * written value is read in batch buffer I don't think this is correct. If a previous posted write writes data and that data is lying in GPU cache, this or a future batch buffer should be able to read that data, even without this flag. The purpose of MI_FORCE_WRITE_COMPLETION_CHECK flag seems to be a sort of SFENCE for the GPU, and SFENCE equivalent can also be done using MI_FLUSH_DW and PIPE_CONTROL. It is the latter which seem to be used in the driver now, rather than this flag. So the purpose is to make the writes globally visible. So if you want to leave a comment here you'd have to say something like that and drop "only needed if written value is read in batch buffer". So something like: /* * At present PIPE_CONTROL and MI_FLUSH_DW flush posted writes (when needed) * in the Xe driver. This flag provides another means of doing the same. */ Otherwise just revert the patch. Thomas can review the wording. Also the OA instance has or will be been soon deleted so is no longer relevant. Sorry, I should have been more careful while reviewing. Thanks. -- Ashutosh > + */ > #define MI_FORCE_WRITE_COMPLETION_CHECK REG_BIT(10) > #define MI_SDI_LEN_DW GENMASK(9, 0) > #define MI_SDI_NUM_DW(x) REG_FIELD_PREP(MI_SDI_LEN_DW, (x) + 3 - 2) > diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migr= ate.c > index 8b32fad678782..9b3bcb789eee3 100644 > --- a/drivers/gpu/drm/xe/xe_migrate.c > +++ b/drivers/gpu/drm/xe/xe_migrate.c > @@ -581,9 +581,7 @@ static void emit_pte(struct xe_migrate *m, > while (ptes) { > u32 chunk =3D min(MAX_PTE_PER_SDI, ptes); > > - bb->cs[bb->len++] =3D MI_STORE_DATA_IMM | > - MI_FORCE_WRITE_COMPLETION_CHECK | > - MI_SDI_NUM_QW(chunk); > + bb->cs[bb->len++] =3D MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk); > bb->cs[bb->len++] =3D ofs; > bb->cs[bb->len++] =3D 0; > > @@ -1225,9 +1223,7 @@ static void write_pgtable(struct xe_tile *tile, str= uct xe_bb *bb, u64 ppgtt_ofs, > if (!(bb->len & 1)) > bb->cs[bb->len++] =3D MI_NOOP; > > - bb->cs[bb->len++] =3D MI_STORE_DATA_IMM | > - MI_FORCE_WRITE_COMPLETION_CHECK | > - MI_SDI_NUM_QW(chunk); > + bb->cs[bb->len++] =3D MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk); > bb->cs[bb->len++] =3D lower_32_bits(addr); > bb->cs[bb->len++] =3D upper_32_bits(addr); > if (pt_op->bind) > @@ -1392,7 +1388,6 @@ __xe_migrate_update_pgtables(struct xe_migrate *m, > u32 idx =3D 0; > > bb->cs[bb->len++] =3D MI_STORE_DATA_IMM | > - MI_FORCE_WRITE_COMPLETION_CHECK | > MI_SDI_NUM_QW(chunk); > bb->cs[bb->len++] =3D ofs; > bb->cs[bb->len++] =3D 0; /* upper_32_bits */ > diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c > index ae94490b0eac8..6b440a5470255 100644 > --- a/drivers/gpu/drm/xe/xe_oa.c > +++ b/drivers/gpu/drm/xe/xe_oa.c > @@ -691,7 +691,6 @@ static void xe_oa_store_flex(struct xe_oa_stream *str= eam, struct xe_lrc *lrc, > > do { > bb->cs[bb->len++] =3D MI_STORE_DATA_IMM | MI_SDI_GGTT | > - MI_FORCE_WRITE_COMPLETION_CHECK | > MI_SDI_NUM_DW(1); > bb->cs[bb->len++] =3D offset + flex->offset * sizeof(u32); > bb->cs[bb->len++] =3D 0; > diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_rin= g_ops.c > index 3a75a08b6be92..0be4f489d3e12 100644 > --- a/drivers/gpu/drm/xe/xe_ring_ops.c > +++ b/drivers/gpu/drm/xe/xe_ring_ops.c > @@ -72,8 +72,7 @@ static int emit_user_interrupt(u32 *dw, int i) > > static int emit_store_imm_ggtt(u32 addr, u32 value, u32 *dw, int i) > { > - dw[i++] =3D MI_STORE_DATA_IMM | MI_SDI_GGTT | > - MI_FORCE_WRITE_COMPLETION_CHECK | MI_SDI_NUM_DW(1); > + dw[i++] =3D MI_STORE_DATA_IMM | MI_SDI_GGTT | MI_SDI_NUM_DW(1); > dw[i++] =3D addr; > dw[i++] =3D 0; > dw[i++] =3D value; > @@ -163,8 +162,7 @@ static int emit_pipe_invalidate(u32 mask_flags, bool = invalidate_tlb, u32 *dw, > static int emit_store_imm_ppgtt_posted(u64 addr, u64 value, > u32 *dw, int i) > { > - dw[i++] =3D MI_STORE_DATA_IMM | MI_FORCE_WRITE_COMPLETION_CHECK | > - MI_SDI_NUM_QW(1); > + dw[i++] =3D MI_STORE_DATA_IMM | MI_SDI_NUM_QW(1); > dw[i++] =3D lower_32_bits(addr); > dw[i++] =3D upper_32_bits(addr); > dw[i++] =3D lower_32_bits(value); > -- > 2.47.1 >