From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C3103E77188 for ; Thu, 19 Dec 2024 02:05:30 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8C9F810E25F; Thu, 19 Dec 2024 02:05:30 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="mqw/TAcT"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7DC2A10E25F for ; Thu, 19 Dec 2024 02:05:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1734573930; x=1766109930; h=date:message-id:from:to:cc:subject:in-reply-to: references:mime-version:content-transfer-encoding; bh=KnxISyjf9sSxanT4fk0YgoV1jiQHygdFDVV/haPjmTc=; b=mqw/TAcT3iKvcQ2AdJoxaZ6LgeGyD/exg/U/ePxlVBxu1VDeCvSxk3v8 PyPwWzYvCjg6jxt36O47wSassP4RSpzerhohec7fNQ8QpFyWjnhjWQ52Z YQVUafYTfcWZh0RwBNWvtnri/hcI13Lerj7oxHlWqphkCYsAlG+ymZZAw nSYMEJ7mlDZxPubbt2zhlpEvaWNWLGCDfAU2YekCnh31QcSg0eTsSQLNx 0zjHFCAzFZuz4ALBmM17aRpQ4tpEG5WTsCDQeltWSy/nnDLZ+gztB0fUP 2WAcWR2wLhOEBVg8nmLwi3Dgw+/pggOawQlrOycxo+fG8uUsp1U4iDz+V g==; X-CSE-ConnectionGUID: 89Tysr76TVKVn2FuetkIPw== X-CSE-MsgGUID: uXA9rdI5TdaPpeXZvXlE2A== X-IronPort-AV: E=McAfee;i="6700,10204,11290"; a="35219828" X-IronPort-AV: E=Sophos;i="6.12,246,1728975600"; d="scan'208";a="35219828" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Dec 2024 18:05:30 -0800 X-CSE-ConnectionGUID: cXjfOMDBRBa5TCfkUA7Ipg== X-CSE-MsgGUID: +0lJfOJHTJiNkFcTJxd95g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,246,1728975600"; d="scan'208";a="98592024" Received: from orsosgc001.jf.intel.com (HELO orsosgc001.intel.com) ([10.165.21.142]) by fmviesa009-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Dec 2024 18:05:28 -0800 Date: Wed, 18 Dec 2024 18:05:28 -0800 Message-ID: <85wmfw1l7b.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" To: Lucas De Marchi , Thomas Hellstrom Cc: "Souza, Jose" , "intel-xe@lists.freedesktop.org" , Matthew Brost , Umesh Nerlige Ramappa Subject: Re: [PATCH] drm/xe: Force write completion of MI_STORE_DATA_IMM In-Reply-To: References: <20241217160732.46280-1-jose.souza@intel.com> <3ebdd73cecef119cbfb7fc902909c8c2adc5b0fc.camel@intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/28.2 (x86_64-redhat-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-BeenThere: intel-xe@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Xe graphics driver List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-xe-bounces@lists.freedesktop.org Sender: "Intel-xe" On Wed, 18 Dec 2024 09:27:36 -0800, Matthew Brost wrote: > Folks, > On Wed, Dec 18, 2024 at 08:38:49AM -0700, Souza, Jose wrote: > > On Tue, 2024-12-17 at 15:39 -0800, Matthew Brost wrote: > > > On Tue, Dec 17, 2024 at 08:07:32AM -0800, Jos=E9 Roberto de Souza wro= te: > > > > With Force write completion unset there is no guarantees of when the > > > > write will be globally visible what is not the behavior wanted. > > > > > > > > > > Do we want this backported? If so, maybe add a fixes? > > > > Not sure, I don't have an actual issue that is fixed by this but I thin= k would be good to have it backported. > > But what do you suggest? Add a fixes tag to the patch removing force pr= obe from LNL? > > > > Yea fixing LNL force probe removal sounds reasonable to me. Hmm the plan was add Fixes to this patch and also Cc: stable. Yet it was merged without these: 1460bb1fef9c ("drm/xe: Force write completion of MI_STORE_DATA_IMM") Lucas/Thomas, Would it be possible to send it to -fixes with a Cc:stable. We have patches depending on this one which we want to send to stable. Thanks. -- Ashutosh > > Matt > > > > > > > Matt > > > > > > > Signed-off-by: Jos=E9 Roberto de Souza > > > > --- > > > > drivers/gpu/drm/xe/instructions/xe_mi_commands.h | 13 +++++++------ > > > > drivers/gpu/drm/xe/xe_migrate.c | 11 ++++++++--- > > > > drivers/gpu/drm/xe/xe_oa.c | 4 +++- > > > > drivers/gpu/drm/xe/xe_ring_ops.c | 6 ++++-- > > > > 4 files changed, 22 insertions(+), 12 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h b/dri= vers/gpu/drm/xe/instructions/xe_mi_commands.h > > > > index 10ec2920d31b3..f4ee910f09432 100644 > > > > --- a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h > > > > +++ b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h > > > > @@ -33,12 +33,13 @@ > > > > #define MI_TOPOLOGY_FILTER __MI_INSTR(0xD) > > > > #define MI_FORCE_WAKEUP __MI_INSTR(0x1D) > > > > > > > > -#define MI_STORE_DATA_IMM __MI_INSTR(0x20) > > > > -#define MI_SDI_GGTT REG_BIT(22) > > > > -#define MI_SDI_LEN_DW GENMASK(9, 0) > > > > -#define MI_SDI_NUM_DW(x) REG_FIELD_PREP(MI_SDI_LEN_DW, (x) + 3 = - 2) > > > > -#define MI_SDI_NUM_QW(x) (REG_FIELD_PREP(MI_SDI_LEN_DW, 2 * (x)= + 3 - 2) | \ > > > > - REG_BIT(21)) > > > > +#define MI_STORE_DATA_IMM __MI_INSTR(0x20) > > > > +#define MI_SDI_GGTT REG_BIT(22) > > > > +#define MI_FORCE_WRITE_COMPLETION_CHECK REG_BIT(10) > > > > +#define MI_SDI_LEN_DW GENMASK(9, 0) > > > > +#define MI_SDI_NUM_DW(x) REG_FIELD_PREP(MI_SDI_LEN_DW, (x) + 3= - 2) > > > > +#define MI_SDI_NUM_QW(x) (REG_FIELD_PREP(MI_SDI_LEN_DW, 2 * (x= ) + 3 - 2) | \ > > > > + REG_BIT(21)) > > > > > > > > #define MI_LOAD_REGISTER_IMM __MI_INSTR(0x22) > > > > #define MI_LRI_LRM_CS_MMIO REG_BIT(19) > > > > diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/x= e_migrate.c > > > > index 1b97d90aaddaf..8b32fad678782 100644 > > > > --- a/drivers/gpu/drm/xe/xe_migrate.c > > > > +++ b/drivers/gpu/drm/xe/xe_migrate.c > > > > @@ -581,7 +581,9 @@ static void emit_pte(struct xe_migrate *m, > > > > while (ptes) { > > > > u32 chunk =3D min(MAX_PTE_PER_SDI, ptes); > > > > > > > > - bb->cs[bb->len++] =3D MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk); > > > > + bb->cs[bb->len++] =3D MI_STORE_DATA_IMM | > > > > + MI_FORCE_WRITE_COMPLETION_CHECK | > > > > + MI_SDI_NUM_QW(chunk); > > > > bb->cs[bb->len++] =3D ofs; > > > > bb->cs[bb->len++] =3D 0; > > > > > > > > @@ -1223,7 +1225,9 @@ static void write_pgtable(struct xe_tile *til= e, struct xe_bb *bb, u64 ppgtt_ofs, > > > > if (!(bb->len & 1)) > > > > bb->cs[bb->len++] =3D MI_NOOP; > > > > > > > > - bb->cs[bb->len++] =3D MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk); > > > > + bb->cs[bb->len++] =3D MI_STORE_DATA_IMM | > > > > + MI_FORCE_WRITE_COMPLETION_CHECK | > > > > + MI_SDI_NUM_QW(chunk); > > > > bb->cs[bb->len++] =3D lower_32_bits(addr); > > > > bb->cs[bb->len++] =3D upper_32_bits(addr); > > > > if (pt_op->bind) > > > > @@ -1388,7 +1392,8 @@ __xe_migrate_update_pgtables(struct xe_migrat= e *m, > > > > u32 idx =3D 0; > > > > > > > > bb->cs[bb->len++] =3D MI_STORE_DATA_IMM | > > > > - MI_SDI_NUM_QW(chunk); > > > > + MI_FORCE_WRITE_COMPLETION_CHECK | > > > > + MI_SDI_NUM_QW(chunk); > > > > bb->cs[bb->len++] =3D ofs; > > > > bb->cs[bb->len++] =3D 0; /* upper_32_bits */ > > > > > > > > diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c > > > > index 56bf375a9d4bc..ae94490b0eac8 100644 > > > > --- a/drivers/gpu/drm/xe/xe_oa.c > > > > +++ b/drivers/gpu/drm/xe/xe_oa.c > > > > @@ -690,7 +690,9 @@ static void xe_oa_store_flex(struct xe_oa_strea= m *stream, struct xe_lrc *lrc, > > > > u32 offset =3D xe_bo_ggtt_addr(lrc->bo); > > > > > > > > do { > > > > - bb->cs[bb->len++] =3D MI_STORE_DATA_IMM | MI_SDI_GGTT | MI_SDI_N= UM_DW(1); > > > > + bb->cs[bb->len++] =3D MI_STORE_DATA_IMM | MI_SDI_GGTT | > > > > + MI_FORCE_WRITE_COMPLETION_CHECK | > > > > + MI_SDI_NUM_DW(1); > > > > bb->cs[bb->len++] =3D offset + flex->offset * sizeof(u32); > > > > bb->cs[bb->len++] =3D 0; > > > > bb->cs[bb->len++] =3D flex->value; > > > > diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/= xe_ring_ops.c > > > > index 0be4f489d3e12..3a75a08b6be92 100644 > > > > --- a/drivers/gpu/drm/xe/xe_ring_ops.c > > > > +++ b/drivers/gpu/drm/xe/xe_ring_ops.c > > > > @@ -72,7 +72,8 @@ static int emit_user_interrupt(u32 *dw, int i) > > > > > > > > static int emit_store_imm_ggtt(u32 addr, u32 value, u32 *dw, int i) > > > > { > > > > - dw[i++] =3D MI_STORE_DATA_IMM | MI_SDI_GGTT | MI_SDI_NUM_DW(1); > > > > + dw[i++] =3D MI_STORE_DATA_IMM | MI_SDI_GGTT | > > > > + MI_FORCE_WRITE_COMPLETION_CHECK | MI_SDI_NUM_DW(1); > > > > dw[i++] =3D addr; > > > > dw[i++] =3D 0; > > > > dw[i++] =3D value; > > > > @@ -162,7 +163,8 @@ static int emit_pipe_invalidate(u32 mask_flags,= bool invalidate_tlb, u32 *dw, > > > > static int emit_store_imm_ppgtt_posted(u64 addr, u64 value, > > > > u32 *dw, int i) > > > > { > > > > - dw[i++] =3D MI_STORE_DATA_IMM | MI_SDI_NUM_QW(1); > > > > + dw[i++] =3D MI_STORE_DATA_IMM | MI_FORCE_WRITE_COMPLETION_CHECK | > > > > + MI_SDI_NUM_QW(1); > > > > dw[i++] =3D lower_32_bits(addr); > > > > dw[i++] =3D upper_32_bits(addr); > > > > dw[i++] =3D lower_32_bits(value); > > > > -- > > > > 2.47.1 > > > > > >