All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] Revert "drm/xe: Force write completion of MI_STORE_DATA_IMM"
       [not found] <20241226162310.47489-1-jose.souza@intel.com>
@ 2024-12-27  3:31 ` Dixit, Ashutosh
  2024-12-27 14:24   ` Souza, Jose
  0 siblings, 1 reply; 4+ messages in thread
From: Dixit, Ashutosh @ 2024-12-27  3:31 UTC (permalink / raw)
  To: José Roberto de Souza; +Cc: intel-xe, Thomas Hellström

On Thu, 26 Dec 2024 08:23:10 -0800, José Roberto de Souza wrote:
>
> This reverts commit 1460bb1fef9ccf7390af0d74a15252442fd6effd.
>
> In all places the MI_STORE_DATA_IMM are not followed by a read of
> the same memory address in the same batch buffer and the posted writes
> are flushed with PIPE_CONTROL or MI_FLUSH_DW in xe_ring_ops.c functions
> so there is no need to set this register.

As I was saying this commit message should just be:

In all places where MI_STORE_DATA_IMM is used, posted writes are flushed
with PIPE_CONTROL or MI_FLUSH_DW in xe_ring_ops.c functions, so there is no
need to set this register.

So according to me the "are not followed by a read of the same memory
address in the same batch buffer" is incorrect, it will work even without
setting the flag.

Anyway apart from that nit, this is:

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>

>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
> Fixes: 1460bb1fef9c ("drm/xe: Force write completion of MI_STORE_DATA_IMM")
> Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
> ---
>  drivers/gpu/drm/xe/instructions/xe_mi_commands.h | 13 ++++++-------
>  drivers/gpu/drm/xe/xe_migrate.c                  | 11 +++--------
>  drivers/gpu/drm/xe/xe_ring_ops.c                 |  6 ++----
>  3 files changed, 11 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> index f4ee910f09432..10ec2920d31b3 100644
> --- a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> +++ b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> @@ -33,13 +33,12 @@
>  #define MI_TOPOLOGY_FILTER		__MI_INSTR(0xD)
>  #define MI_FORCE_WAKEUP			__MI_INSTR(0x1D)
>
> -#define MI_STORE_DATA_IMM			__MI_INSTR(0x20)
> -#define   MI_SDI_GGTT				REG_BIT(22)
> -#define   MI_FORCE_WRITE_COMPLETION_CHECK	REG_BIT(10)
> -#define   MI_SDI_LEN_DW				GENMASK(9, 0)
> -#define   MI_SDI_NUM_DW(x)			REG_FIELD_PREP(MI_SDI_LEN_DW, (x) + 3 - 2)
> -#define   MI_SDI_NUM_QW(x)			(REG_FIELD_PREP(MI_SDI_LEN_DW, 2 * (x) + 3 - 2) | \
> -						 REG_BIT(21))
> +#define MI_STORE_DATA_IMM		__MI_INSTR(0x20)
> +#define   MI_SDI_GGTT			REG_BIT(22)
> +#define   MI_SDI_LEN_DW			GENMASK(9, 0)
> +#define   MI_SDI_NUM_DW(x)		REG_FIELD_PREP(MI_SDI_LEN_DW, (x) + 3 - 2)
> +#define   MI_SDI_NUM_QW(x)		(REG_FIELD_PREP(MI_SDI_LEN_DW, 2 * (x) + 3 - 2) | \
> +					 REG_BIT(21))
>
>  #define MI_LOAD_REGISTER_IMM		__MI_INSTR(0x22)
>  #define   MI_LRI_LRM_CS_MMIO		REG_BIT(19)
> diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
> index 8b32fad678782..1b97d90aaddaf 100644
> --- a/drivers/gpu/drm/xe/xe_migrate.c
> +++ b/drivers/gpu/drm/xe/xe_migrate.c
> @@ -581,9 +581,7 @@ static void emit_pte(struct xe_migrate *m,
>	while (ptes) {
>		u32 chunk = min(MAX_PTE_PER_SDI, ptes);
>
> -		bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> -				    MI_FORCE_WRITE_COMPLETION_CHECK |
> -				    MI_SDI_NUM_QW(chunk);
> +		bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk);
>		bb->cs[bb->len++] = ofs;
>		bb->cs[bb->len++] = 0;
>
> @@ -1225,9 +1223,7 @@ static void write_pgtable(struct xe_tile *tile, struct xe_bb *bb, u64 ppgtt_ofs,
>		if (!(bb->len & 1))
>			bb->cs[bb->len++] = MI_NOOP;
>
> -		bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> -				    MI_FORCE_WRITE_COMPLETION_CHECK |
> -				    MI_SDI_NUM_QW(chunk);
> +		bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk);
>		bb->cs[bb->len++] = lower_32_bits(addr);
>		bb->cs[bb->len++] = upper_32_bits(addr);
>		if (pt_op->bind)
> @@ -1392,8 +1388,7 @@ __xe_migrate_update_pgtables(struct xe_migrate *m,
>			u32 idx = 0;
>
>			bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> -					    MI_FORCE_WRITE_COMPLETION_CHECK |
> -					    MI_SDI_NUM_QW(chunk);
> +				MI_SDI_NUM_QW(chunk);
>			bb->cs[bb->len++] = ofs;
>			bb->cs[bb->len++] = 0; /* upper_32_bits */
>
> diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
> index c8ab37fa0d198..9f327f27c0726 100644
> --- a/drivers/gpu/drm/xe/xe_ring_ops.c
> +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
> @@ -72,8 +72,7 @@ static int emit_user_interrupt(u32 *dw, int i)
>
>  static int emit_store_imm_ggtt(u32 addr, u32 value, u32 *dw, int i)
>  {
> -	dw[i++] = MI_STORE_DATA_IMM | MI_SDI_GGTT |
> -		  MI_FORCE_WRITE_COMPLETION_CHECK | MI_SDI_NUM_DW(1);
> +	dw[i++] = MI_STORE_DATA_IMM | MI_SDI_GGTT | MI_SDI_NUM_DW(1);
>	dw[i++] = addr;
>	dw[i++] = 0;
>	dw[i++] = value;
> @@ -163,8 +162,7 @@ static int emit_pipe_invalidate(u32 mask_flags, bool invalidate_tlb, u32 *dw,
>  static int emit_store_imm_ppgtt_posted(u64 addr, u64 value,
>				       u32 *dw, int i)
>  {
> -	dw[i++] = MI_STORE_DATA_IMM | MI_FORCE_WRITE_COMPLETION_CHECK |
> -		  MI_SDI_NUM_QW(1);
> +	dw[i++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(1);
>	dw[i++] = lower_32_bits(addr);
>	dw[i++] = upper_32_bits(addr);
>	dw[i++] = lower_32_bits(value);
> --
> 2.47.1
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Revert "drm/xe: Force write completion of MI_STORE_DATA_IMM"
  2024-12-27  3:31 ` [PATCH] Revert "drm/xe: Force write completion of MI_STORE_DATA_IMM" Dixit, Ashutosh
@ 2024-12-27 14:24   ` Souza, Jose
  2024-12-27 17:26     ` Dixit, Ashutosh
  0 siblings, 1 reply; 4+ messages in thread
From: Souza, Jose @ 2024-12-27 14:24 UTC (permalink / raw)
  To: Dixit, Ashutosh
  Cc: intel-xe@lists.freedesktop.org, thomas.hellstrom@linux.intel.com

On Thu, 2024-12-26 at 19:31 -0800, Dixit, Ashutosh wrote:
> On Thu, 26 Dec 2024 08:23:10 -0800, José Roberto de Souza wrote:
> > 
> > This reverts commit 1460bb1fef9ccf7390af0d74a15252442fd6effd.
> > 
> > In all places the MI_STORE_DATA_IMM are not followed by a read of
> > the same memory address in the same batch buffer and the posted writes
> > are flushed with PIPE_CONTROL or MI_FLUSH_DW in xe_ring_ops.c functions
> > so there is no need to set this register.
> 
> As I was saying this commit message should just be:
> 
> In all places where MI_STORE_DATA_IMM is used, posted writes are flushed
> with PIPE_CONTROL or MI_FLUSH_DW in xe_ring_ops.c functions, so there is no
> need to set this register.
> 
> So according to me the "are not followed by a read of the same memory
> address in the same batch buffer" is incorrect, it will work even without
> setting the flag.

Not on Xe2+, with CS_PRIORITY_MEM_READ set there is a priority path to memory for reads and other for writes.

> 
> Anyway apart from that nit, this is:
> 
> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> 
> > 
> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > Fixes: 1460bb1fef9c ("drm/xe: Force write completion of MI_STORE_DATA_IMM")
> > Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
> > ---
> >  drivers/gpu/drm/xe/instructions/xe_mi_commands.h | 13 ++++++-------
> >  drivers/gpu/drm/xe/xe_migrate.c                  | 11 +++--------
> >  drivers/gpu/drm/xe/xe_ring_ops.c                 |  6 ++----
> >  3 files changed, 11 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> > index f4ee910f09432..10ec2920d31b3 100644
> > --- a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> > +++ b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> > @@ -33,13 +33,12 @@
> >  #define MI_TOPOLOGY_FILTER		__MI_INSTR(0xD)
> >  #define MI_FORCE_WAKEUP			__MI_INSTR(0x1D)
> > 
> > -#define MI_STORE_DATA_IMM			__MI_INSTR(0x20)
> > -#define   MI_SDI_GGTT				REG_BIT(22)
> > -#define   MI_FORCE_WRITE_COMPLETION_CHECK	REG_BIT(10)
> > -#define   MI_SDI_LEN_DW				GENMASK(9, 0)
> > -#define   MI_SDI_NUM_DW(x)			REG_FIELD_PREP(MI_SDI_LEN_DW, (x) + 3 - 2)
> > -#define   MI_SDI_NUM_QW(x)			(REG_FIELD_PREP(MI_SDI_LEN_DW, 2 * (x) + 3 - 2) | \
> > -						 REG_BIT(21))
> > +#define MI_STORE_DATA_IMM		__MI_INSTR(0x20)
> > +#define   MI_SDI_GGTT			REG_BIT(22)
> > +#define   MI_SDI_LEN_DW			GENMASK(9, 0)
> > +#define   MI_SDI_NUM_DW(x)		REG_FIELD_PREP(MI_SDI_LEN_DW, (x) + 3 - 2)
> > +#define   MI_SDI_NUM_QW(x)		(REG_FIELD_PREP(MI_SDI_LEN_DW, 2 * (x) + 3 - 2) | \
> > +					 REG_BIT(21))
> > 
> >  #define MI_LOAD_REGISTER_IMM		__MI_INSTR(0x22)
> >  #define   MI_LRI_LRM_CS_MMIO		REG_BIT(19)
> > diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
> > index 8b32fad678782..1b97d90aaddaf 100644
> > --- a/drivers/gpu/drm/xe/xe_migrate.c
> > +++ b/drivers/gpu/drm/xe/xe_migrate.c
> > @@ -581,9 +581,7 @@ static void emit_pte(struct xe_migrate *m,
> > 	while (ptes) {
> > 		u32 chunk = min(MAX_PTE_PER_SDI, ptes);
> > 
> > -		bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> > -				    MI_FORCE_WRITE_COMPLETION_CHECK |
> > -				    MI_SDI_NUM_QW(chunk);
> > +		bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk);
> > 		bb->cs[bb->len++] = ofs;
> > 		bb->cs[bb->len++] = 0;
> > 
> > @@ -1225,9 +1223,7 @@ static void write_pgtable(struct xe_tile *tile, struct xe_bb *bb, u64 ppgtt_ofs,
> > 		if (!(bb->len & 1))
> > 			bb->cs[bb->len++] = MI_NOOP;
> > 
> > -		bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> > -				    MI_FORCE_WRITE_COMPLETION_CHECK |
> > -				    MI_SDI_NUM_QW(chunk);
> > +		bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk);
> > 		bb->cs[bb->len++] = lower_32_bits(addr);
> > 		bb->cs[bb->len++] = upper_32_bits(addr);
> > 		if (pt_op->bind)
> > @@ -1392,8 +1388,7 @@ __xe_migrate_update_pgtables(struct xe_migrate *m,
> > 			u32 idx = 0;
> > 
> > 			bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> > -					    MI_FORCE_WRITE_COMPLETION_CHECK |
> > -					    MI_SDI_NUM_QW(chunk);
> > +				MI_SDI_NUM_QW(chunk);
> > 			bb->cs[bb->len++] = ofs;
> > 			bb->cs[bb->len++] = 0; /* upper_32_bits */
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
> > index c8ab37fa0d198..9f327f27c0726 100644
> > --- a/drivers/gpu/drm/xe/xe_ring_ops.c
> > +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
> > @@ -72,8 +72,7 @@ static int emit_user_interrupt(u32 *dw, int i)
> > 
> >  static int emit_store_imm_ggtt(u32 addr, u32 value, u32 *dw, int i)
> >  {
> > -	dw[i++] = MI_STORE_DATA_IMM | MI_SDI_GGTT |
> > -		  MI_FORCE_WRITE_COMPLETION_CHECK | MI_SDI_NUM_DW(1);
> > +	dw[i++] = MI_STORE_DATA_IMM | MI_SDI_GGTT | MI_SDI_NUM_DW(1);
> > 	dw[i++] = addr;
> > 	dw[i++] = 0;
> > 	dw[i++] = value;
> > @@ -163,8 +162,7 @@ static int emit_pipe_invalidate(u32 mask_flags, bool invalidate_tlb, u32 *dw,
> >  static int emit_store_imm_ppgtt_posted(u64 addr, u64 value,
> > 				       u32 *dw, int i)
> >  {
> > -	dw[i++] = MI_STORE_DATA_IMM | MI_FORCE_WRITE_COMPLETION_CHECK |
> > -		  MI_SDI_NUM_QW(1);
> > +	dw[i++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(1);
> > 	dw[i++] = lower_32_bits(addr);
> > 	dw[i++] = upper_32_bits(addr);
> > 	dw[i++] = lower_32_bits(value);
> > --
> > 2.47.1
> > 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Revert "drm/xe: Force write completion of MI_STORE_DATA_IMM"
  2024-12-27 14:24   ` Souza, Jose
@ 2024-12-27 17:26     ` Dixit, Ashutosh
  2024-12-27 17:47       ` Souza, Jose
  0 siblings, 1 reply; 4+ messages in thread
From: Dixit, Ashutosh @ 2024-12-27 17:26 UTC (permalink / raw)
  To: Souza, Jose
  Cc: intel-xe@lists.freedesktop.org, thomas.hellstrom@linux.intel.com

On Fri, 27 Dec 2024 06:24:22 -0800, Souza, Jose wrote:
>
> On Thu, 2024-12-26 at 19:31 -0800, Dixit, Ashutosh wrote:
> > On Thu, 26 Dec 2024 08:23:10 -0800, José Roberto de Souza wrote:
> > >
> > > This reverts commit 1460bb1fef9ccf7390af0d74a15252442fd6effd.
> > >
> > > In all places the MI_STORE_DATA_IMM are not followed by a read of
> > > the same memory address in the same batch buffer and the posted writes
> > > are flushed with PIPE_CONTROL or MI_FLUSH_DW in xe_ring_ops.c functions
> > > so there is no need to set this register.
> >
> > As I was saying this commit message should just be:
> >
> > In all places where MI_STORE_DATA_IMM is used, posted writes are flushed
> > with PIPE_CONTROL or MI_FLUSH_DW in xe_ring_ops.c functions, so there is no
> > need to set this register.
> >
> > So according to me the "are not followed by a read of the same memory
> > address in the same batch buffer" is incorrect, it will work even without
> > setting the flag.
>
> Not on Xe2+, with CS_PRIORITY_MEM_READ set there is a priority path to
> memory for reads and other for writes.

OK, in that case please go ahead and merge.

>
> >
> > Anyway apart from that nit, this is:
> >
> > Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> >
> > >
> > > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > > Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > > Fixes: 1460bb1fef9c ("drm/xe: Force write completion of MI_STORE_DATA_IMM")
> > > Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
> > > ---
> > >  drivers/gpu/drm/xe/instructions/xe_mi_commands.h | 13 ++++++-------
> > >  drivers/gpu/drm/xe/xe_migrate.c                  | 11 +++--------
> > >  drivers/gpu/drm/xe/xe_ring_ops.c                 |  6 ++----
> > >  3 files changed, 11 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> > > index f4ee910f09432..10ec2920d31b3 100644
> > > --- a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> > > +++ b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> > > @@ -33,13 +33,12 @@
> > >  #define MI_TOPOLOGY_FILTER		__MI_INSTR(0xD)
> > >  #define MI_FORCE_WAKEUP			__MI_INSTR(0x1D)
> > >
> > > -#define MI_STORE_DATA_IMM			__MI_INSTR(0x20)
> > > -#define   MI_SDI_GGTT				REG_BIT(22)
> > > -#define   MI_FORCE_WRITE_COMPLETION_CHECK	REG_BIT(10)
> > > -#define   MI_SDI_LEN_DW				GENMASK(9, 0)
> > > -#define   MI_SDI_NUM_DW(x)			REG_FIELD_PREP(MI_SDI_LEN_DW, (x) + 3 - 2)
> > > -#define   MI_SDI_NUM_QW(x)			(REG_FIELD_PREP(MI_SDI_LEN_DW, 2 * (x) + 3 - 2) | \
> > > -						 REG_BIT(21))
> > > +#define MI_STORE_DATA_IMM		__MI_INSTR(0x20)
> > > +#define   MI_SDI_GGTT			REG_BIT(22)
> > > +#define   MI_SDI_LEN_DW			GENMASK(9, 0)
> > > +#define   MI_SDI_NUM_DW(x)		REG_FIELD_PREP(MI_SDI_LEN_DW, (x) + 3 - 2)
> > > +#define   MI_SDI_NUM_QW(x)		(REG_FIELD_PREP(MI_SDI_LEN_DW, 2 * (x) + 3 - 2) | \
> > > +					 REG_BIT(21))
> > >
> > >  #define MI_LOAD_REGISTER_IMM		__MI_INSTR(0x22)
> > >  #define   MI_LRI_LRM_CS_MMIO		REG_BIT(19)
> > > diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
> > > index 8b32fad678782..1b97d90aaddaf 100644
> > > --- a/drivers/gpu/drm/xe/xe_migrate.c
> > > +++ b/drivers/gpu/drm/xe/xe_migrate.c
> > > @@ -581,9 +581,7 @@ static void emit_pte(struct xe_migrate *m,
> > >	while (ptes) {
> > >		u32 chunk = min(MAX_PTE_PER_SDI, ptes);
> > >
> > > -		bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> > > -				    MI_FORCE_WRITE_COMPLETION_CHECK |
> > > -				    MI_SDI_NUM_QW(chunk);
> > > +		bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk);
> > >		bb->cs[bb->len++] = ofs;
> > >		bb->cs[bb->len++] = 0;
> > >
> > > @@ -1225,9 +1223,7 @@ static void write_pgtable(struct xe_tile *tile, struct xe_bb *bb, u64 ppgtt_ofs,
> > >		if (!(bb->len & 1))
> > >			bb->cs[bb->len++] = MI_NOOP;
> > >
> > > -		bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> > > -				    MI_FORCE_WRITE_COMPLETION_CHECK |
> > > -				    MI_SDI_NUM_QW(chunk);
> > > +		bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk);
> > >		bb->cs[bb->len++] = lower_32_bits(addr);
> > >		bb->cs[bb->len++] = upper_32_bits(addr);
> > >		if (pt_op->bind)
> > > @@ -1392,8 +1388,7 @@ __xe_migrate_update_pgtables(struct xe_migrate *m,
> > >			u32 idx = 0;
> > >
> > >			bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> > > -					    MI_FORCE_WRITE_COMPLETION_CHECK |
> > > -					    MI_SDI_NUM_QW(chunk);
> > > +				MI_SDI_NUM_QW(chunk);
> > >			bb->cs[bb->len++] = ofs;
> > >			bb->cs[bb->len++] = 0; /* upper_32_bits */
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
> > > index c8ab37fa0d198..9f327f27c0726 100644
> > > --- a/drivers/gpu/drm/xe/xe_ring_ops.c
> > > +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
> > > @@ -72,8 +72,7 @@ static int emit_user_interrupt(u32 *dw, int i)
> > >
> > >  static int emit_store_imm_ggtt(u32 addr, u32 value, u32 *dw, int i)
> > >  {
> > > -	dw[i++] = MI_STORE_DATA_IMM | MI_SDI_GGTT |
> > > -		  MI_FORCE_WRITE_COMPLETION_CHECK | MI_SDI_NUM_DW(1);
> > > +	dw[i++] = MI_STORE_DATA_IMM | MI_SDI_GGTT | MI_SDI_NUM_DW(1);
> > >	dw[i++] = addr;
> > >	dw[i++] = 0;
> > >	dw[i++] = value;
> > > @@ -163,8 +162,7 @@ static int emit_pipe_invalidate(u32 mask_flags, bool invalidate_tlb, u32 *dw,
> > >  static int emit_store_imm_ppgtt_posted(u64 addr, u64 value,
> > >				       u32 *dw, int i)
> > >  {
> > > -	dw[i++] = MI_STORE_DATA_IMM | MI_FORCE_WRITE_COMPLETION_CHECK |
> > > -		  MI_SDI_NUM_QW(1);
> > > +	dw[i++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(1);
> > >	dw[i++] = lower_32_bits(addr);
> > >	dw[i++] = upper_32_bits(addr);
> > >	dw[i++] = lower_32_bits(value);
> > > --
> > > 2.47.1
> > >
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Revert "drm/xe: Force write completion of MI_STORE_DATA_IMM"
  2024-12-27 17:26     ` Dixit, Ashutosh
@ 2024-12-27 17:47       ` Souza, Jose
  0 siblings, 0 replies; 4+ messages in thread
From: Souza, Jose @ 2024-12-27 17:47 UTC (permalink / raw)
  To: Dixit, Ashutosh
  Cc: intel-xe@lists.freedesktop.org, thomas.hellstrom@linux.intel.com

On Fri, 2024-12-27 at 09:26 -0800, Dixit, Ashutosh wrote:
> On Fri, 27 Dec 2024 06:24:22 -0800, Souza, Jose wrote:
> > 
> > On Thu, 2024-12-26 at 19:31 -0800, Dixit, Ashutosh wrote:
> > > On Thu, 26 Dec 2024 08:23:10 -0800, José Roberto de Souza wrote:
> > > > 
> > > > This reverts commit 1460bb1fef9ccf7390af0d74a15252442fd6effd.
> > > > 
> > > > In all places the MI_STORE_DATA_IMM are not followed by a read of
> > > > the same memory address in the same batch buffer and the posted writes
> > > > are flushed with PIPE_CONTROL or MI_FLUSH_DW in xe_ring_ops.c functions
> > > > so there is no need to set this register.
> > > 
> > > As I was saying this commit message should just be:
> > > 
> > > In all places where MI_STORE_DATA_IMM is used, posted writes are flushed
> > > with PIPE_CONTROL or MI_FLUSH_DW in xe_ring_ops.c functions, so there is no
> > > need to set this register.
> > > 
> > > So according to me the "are not followed by a read of the same memory
> > > address in the same batch buffer" is incorrect, it will work even without
> > > setting the flag.
> > 
> > Not on Xe2+, with CS_PRIORITY_MEM_READ set there is a priority path to
> > memory for reads and other for writes.
> 
> OK, in that case please go ahead and merge.

thank you, will send this again as this version was not picked by patchwork or CI because freedesktop server ran out of disk space.

> 
> > 
> > > 
> > > Anyway apart from that nit, this is:
> > > 
> > > Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > > 
> > > > 
> > > > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > > > Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
> > > > Fixes: 1460bb1fef9c ("drm/xe: Force write completion of MI_STORE_DATA_IMM")
> > > > Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/xe/instructions/xe_mi_commands.h | 13 ++++++-------
> > > >  drivers/gpu/drm/xe/xe_migrate.c                  | 11 +++--------
> > > >  drivers/gpu/drm/xe/xe_ring_ops.c                 |  6 ++----
> > > >  3 files changed, 11 insertions(+), 19 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> > > > index f4ee910f09432..10ec2920d31b3 100644
> > > > --- a/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> > > > +++ b/drivers/gpu/drm/xe/instructions/xe_mi_commands.h
> > > > @@ -33,13 +33,12 @@
> > > >  #define MI_TOPOLOGY_FILTER		__MI_INSTR(0xD)
> > > >  #define MI_FORCE_WAKEUP			__MI_INSTR(0x1D)
> > > > 
> > > > -#define MI_STORE_DATA_IMM			__MI_INSTR(0x20)
> > > > -#define   MI_SDI_GGTT				REG_BIT(22)
> > > > -#define   MI_FORCE_WRITE_COMPLETION_CHECK	REG_BIT(10)
> > > > -#define   MI_SDI_LEN_DW				GENMASK(9, 0)
> > > > -#define   MI_SDI_NUM_DW(x)			REG_FIELD_PREP(MI_SDI_LEN_DW, (x) + 3 - 2)
> > > > -#define   MI_SDI_NUM_QW(x)			(REG_FIELD_PREP(MI_SDI_LEN_DW, 2 * (x) + 3 - 2) | \
> > > > -						 REG_BIT(21))
> > > > +#define MI_STORE_DATA_IMM		__MI_INSTR(0x20)
> > > > +#define   MI_SDI_GGTT			REG_BIT(22)
> > > > +#define   MI_SDI_LEN_DW			GENMASK(9, 0)
> > > > +#define   MI_SDI_NUM_DW(x)		REG_FIELD_PREP(MI_SDI_LEN_DW, (x) + 3 - 2)
> > > > +#define   MI_SDI_NUM_QW(x)		(REG_FIELD_PREP(MI_SDI_LEN_DW, 2 * (x) + 3 - 2) | \
> > > > +					 REG_BIT(21))
> > > > 
> > > >  #define MI_LOAD_REGISTER_IMM		__MI_INSTR(0x22)
> > > >  #define   MI_LRI_LRM_CS_MMIO		REG_BIT(19)
> > > > diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
> > > > index 8b32fad678782..1b97d90aaddaf 100644
> > > > --- a/drivers/gpu/drm/xe/xe_migrate.c
> > > > +++ b/drivers/gpu/drm/xe/xe_migrate.c
> > > > @@ -581,9 +581,7 @@ static void emit_pte(struct xe_migrate *m,
> > > > 	while (ptes) {
> > > > 		u32 chunk = min(MAX_PTE_PER_SDI, ptes);
> > > > 
> > > > -		bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> > > > -				    MI_FORCE_WRITE_COMPLETION_CHECK |
> > > > -				    MI_SDI_NUM_QW(chunk);
> > > > +		bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk);
> > > > 		bb->cs[bb->len++] = ofs;
> > > > 		bb->cs[bb->len++] = 0;
> > > > 
> > > > @@ -1225,9 +1223,7 @@ static void write_pgtable(struct xe_tile *tile, struct xe_bb *bb, u64 ppgtt_ofs,
> > > > 		if (!(bb->len & 1))
> > > > 			bb->cs[bb->len++] = MI_NOOP;
> > > > 
> > > > -		bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> > > > -				    MI_FORCE_WRITE_COMPLETION_CHECK |
> > > > -				    MI_SDI_NUM_QW(chunk);
> > > > +		bb->cs[bb->len++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(chunk);
> > > > 		bb->cs[bb->len++] = lower_32_bits(addr);
> > > > 		bb->cs[bb->len++] = upper_32_bits(addr);
> > > > 		if (pt_op->bind)
> > > > @@ -1392,8 +1388,7 @@ __xe_migrate_update_pgtables(struct xe_migrate *m,
> > > > 			u32 idx = 0;
> > > > 
> > > > 			bb->cs[bb->len++] = MI_STORE_DATA_IMM |
> > > > -					    MI_FORCE_WRITE_COMPLETION_CHECK |
> > > > -					    MI_SDI_NUM_QW(chunk);
> > > > +				MI_SDI_NUM_QW(chunk);
> > > > 			bb->cs[bb->len++] = ofs;
> > > > 			bb->cs[bb->len++] = 0; /* upper_32_bits */
> > > > 
> > > > diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
> > > > index c8ab37fa0d198..9f327f27c0726 100644
> > > > --- a/drivers/gpu/drm/xe/xe_ring_ops.c
> > > > +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
> > > > @@ -72,8 +72,7 @@ static int emit_user_interrupt(u32 *dw, int i)
> > > > 
> > > >  static int emit_store_imm_ggtt(u32 addr, u32 value, u32 *dw, int i)
> > > >  {
> > > > -	dw[i++] = MI_STORE_DATA_IMM | MI_SDI_GGTT |
> > > > -		  MI_FORCE_WRITE_COMPLETION_CHECK | MI_SDI_NUM_DW(1);
> > > > +	dw[i++] = MI_STORE_DATA_IMM | MI_SDI_GGTT | MI_SDI_NUM_DW(1);
> > > > 	dw[i++] = addr;
> > > > 	dw[i++] = 0;
> > > > 	dw[i++] = value;
> > > > @@ -163,8 +162,7 @@ static int emit_pipe_invalidate(u32 mask_flags, bool invalidate_tlb, u32 *dw,
> > > >  static int emit_store_imm_ppgtt_posted(u64 addr, u64 value,
> > > > 				       u32 *dw, int i)
> > > >  {
> > > > -	dw[i++] = MI_STORE_DATA_IMM | MI_FORCE_WRITE_COMPLETION_CHECK |
> > > > -		  MI_SDI_NUM_QW(1);
> > > > +	dw[i++] = MI_STORE_DATA_IMM | MI_SDI_NUM_QW(1);
> > > > 	dw[i++] = lower_32_bits(addr);
> > > > 	dw[i++] = upper_32_bits(addr);
> > > > 	dw[i++] = lower_32_bits(value);
> > > > --
> > > > 2.47.1
> > > > 
> > 


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-12-27 17:47 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20241226162310.47489-1-jose.souza@intel.com>
2024-12-27  3:31 ` [PATCH] Revert "drm/xe: Force write completion of MI_STORE_DATA_IMM" Dixit, Ashutosh
2024-12-27 14:24   ` Souza, Jose
2024-12-27 17:26     ` Dixit, Ashutosh
2024-12-27 17:47       ` Souza, Jose

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.