From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: "José Roberto de Souza" <jose.souza@intel.com>
Cc: intel-xe@lists.freedesktop.org,
"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
"Jonathan Cavitt" <jonathan.cavitt@intel.com>
Subject: Re: [PATCH] drm/xe: Remove 'Force write completion' from MI_STORE_DATA_IMM
Date: Mon, 23 Dec 2024 19:08:09 -0800 [thread overview]
Message-ID: <85cyhh22xy.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <20241223182918.126288-1-jose.souza@intel.com>
On Mon, 23 Dec 2024 10:29:18 -0800, José Roberto de Souza wrote:
>
> In all places the MI_STORE_DATA_IMM are not followed by a read of
> the same memory address in the same batch buffer and the posted writes
> are flushed with PIPE_CONTROL or MI_FLUSH_DW in xe_ring_ops.c functions
> so there is no need to set this register.
>
> This is not a revert because I think it is worthy keep the register
> definition and the comment on top of it so future readers don't
> get the same wrong conclusion as I did.
>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
> Fixes: 1460bb1fef9c ("drm/xe: Force write completion of MI_STORE_DATA_IMM")
> Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
> ---
> drivers/gpu/drm/xe/instructions/xe_mi_commands.h | 4 ++++
> drivers/gpu/drm/xe/xe_migrate.c | 9 ++-------
> drivers/gpu/drm/xe/xe_oa.c | 1 -
> drivers/gpu/drm/xe/xe_ring_ops.c | 6 ++----
> 4 files changed, 8 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> index f4ee910f09432..8e9884aaa4b8a 100644
> --- a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> +++ b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> @@ -35,6 +35,10 @@
>
> #define MI_STORE_DATA_IMM __MI_INSTR(0x20)
> #define MI_SDI_GGTT REG_BIT(22)
> +/*
> + * PIPE_CONTROL and MI_FLUSH_DW flushes all posted writes, only needed if
> + * written value is read in batch buffer
I don't think this is correct. If a previous posted write writes data and
that data is lying in GPU cache, this or a future batch buffer should be
able to read that data, even without this flag.
The purpose of MI_FORCE_WRITE_COMPLETION_CHECK flag seems to be a sort of
SFENCE for the GPU, and SFENCE equivalent can also be done using
MI_FLUSH_DW and PIPE_CONTROL. It is the latter which seem to be used in the
driver now, rather than this flag. So the purpose is to make the writes
globally visible.
So if you want to leave a comment here you'd have to say something like
that and drop "only needed if written value is read in batch
buffer". So something like:
/*
* At present PIPE_CONTROL and MI_FLUSH_DW flush posted writes (when needed)
* in the Xe driver. This flag provides another means of doing the same.
*/
Otherwise just revert the patch. Thomas can review the wording.
Also the OA instance has or will be been soon deleted so is no longer
relevant.
Sorry, I should have been more careful while reviewing.
Thanks.
--
Ashutosh
> + */
> #define MI_FORCE_WRITE_COMPLETION_CHECK REG_BIT(10)
> #define MI_SDI_LEN_DW GENMASK(9, 0)
> #define MI_SDI_NUM_DW(x) REG_FIELD_PREP(MI_SDI_LEN_DW, (x) + 3 - 2)
> diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
> index 8b32fad678782..9b3bcb789eee3 100644
> --- a/drivers/gpu/drm/xe/xe_migrate.c
> +++ b/drivers/gpu/drm/xe/xe_migrate.c
> @@ -581,9 +581,7 @@ static void emit_pte(struct xe_migrate *m,
> while (ptes) {
> u32 chunk = min(MAX_PTE_PER_SDI, ptes);
>
> - bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> - MI_FORCE_WRITE_COMPLETION_CHECK |
> - MI_SDI_NUM_QW(chunk);
> + bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk);
> bb->cs[bb->len++] = ofs;
> bb->cs[bb->len++] = 0;
>
> @@ -1225,9 +1223,7 @@ static void write_pgtable(struct xe_tile *tile, struct xe_bb *bb, u64 ppgtt_ofs,
> if (!(bb->len & 1))
> bb->cs[bb->len++] = MI_NOOP;
>
> - bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> - MI_FORCE_WRITE_COMPLETION_CHECK |
> - MI_SDI_NUM_QW(chunk);
> + bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk);
> bb->cs[bb->len++] = lower_32_bits(addr);
> bb->cs[bb->len++] = upper_32_bits(addr);
> if (pt_op->bind)
> @@ -1392,7 +1388,6 @@ __xe_migrate_update_pgtables(struct xe_migrate *m,
> u32 idx = 0;
>
> bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> - MI_FORCE_WRITE_COMPLETION_CHECK |
> MI_SDI_NUM_QW(chunk);
> bb->cs[bb->len++] = ofs;
> bb->cs[bb->len++] = 0; /* upper_32_bits */
> diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c
> index ae94490b0eac8..6b440a5470255 100644
> --- a/drivers/gpu/drm/xe/xe_oa.c
> +++ b/drivers/gpu/drm/xe/xe_oa.c
> @@ -691,7 +691,6 @@ static void xe_oa_store_flex(struct xe_oa_stream *stream, struct xe_lrc *lrc,
>
> do {
> bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_GGTT |
> - MI_FORCE_WRITE_COMPLETION_CHECK |
> MI_SDI_NUM_DW(1);
> bb->cs[bb->len++] = offset + flex->offset * sizeof(u32);
> bb->cs[bb->len++] = 0;
> diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
> index 3a75a08b6be92..0be4f489d3e12 100644
> --- a/drivers/gpu/drm/xe/xe_ring_ops.c
> +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
> @@ -72,8 +72,7 @@ static int emit_user_interrupt(u32 *dw, int i)
>
> static int emit_store_imm_ggtt(u32 addr, u32 value, u32 *dw, int i)
> {
> - dw[i++] = MI_STORE_DATA_IMM | MI_SDI_GGTT |
> - MI_FORCE_WRITE_COMPLETION_CHECK | MI_SDI_NUM_DW(1);
> + dw[i++] = MI_STORE_DATA_IMM | MI_SDI_GGTT | MI_SDI_NUM_DW(1);
> dw[i++] = addr;
> dw[i++] = 0;
> dw[i++] = value;
> @@ -163,8 +162,7 @@ static int emit_pipe_invalidate(u32 mask_flags, bool invalidate_tlb, u32 *dw,
> static int emit_store_imm_ppgtt_posted(u64 addr, u64 value,
> u32 *dw, int i)
> {
> - dw[i++] = MI_STORE_DATA_IMM | MI_FORCE_WRITE_COMPLETION_CHECK |
> - MI_SDI_NUM_QW(1);
> + dw[i++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(1);
> dw[i++] = lower_32_bits(addr);
> dw[i++] = upper_32_bits(addr);
> dw[i++] = lower_32_bits(value);
> --
> 2.47.1
>
next prev parent reply other threads:[~2024-12-24 3:11 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-23 18:29 [PATCH] drm/xe: Remove 'Force write completion' from MI_STORE_DATA_IMM José Roberto de Souza
2024-12-23 18:34 ` ✓ CI.Patch_applied: success for " Patchwork
2024-12-23 18:35 ` ✓ CI.checkpatch: " Patchwork
2024-12-23 18:36 ` ✓ CI.KUnit: " Patchwork
2024-12-23 18:50 ` [PATCH] " Cavitt, Jonathan
2024-12-23 18:54 ` ✓ CI.Build: success for " Patchwork
2024-12-23 18:56 ` ✓ CI.Hooks: " Patchwork
2024-12-23 18:58 ` ✓ CI.checksparse: " Patchwork
2024-12-23 19:52 ` ✓ Xe.CI.BAT: " Patchwork
2024-12-23 22:04 ` ✗ Xe.CI.Full: failure " Patchwork
2024-12-24 3:08 ` Dixit, Ashutosh [this message]
2024-12-26 11:09 ` [PATCH] " Thomas Hellström
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=85cyhh22xy.wl-ashutosh.dixit@intel.com \
--to=ashutosh.dixit@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=jonathan.cavitt@intel.com \
--cc=jose.souza@intel.com \
--cc=thomas.hellstrom@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox