From: Matthew Brost <matthew.brost@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>
Cc: "Rodrigo Vivi" <rodrigo.vivi@intel.com>,
"Ville Syrjälä" <ville.syrjala@linux.intel.com>,
"K V P, Satyanarayana" <satyanarayana.k.v.p@intel.com>,
intel-xe@lists.freedesktop.org,
"Michal Wajdeczko" <michal.wajdeczko@intel.com>,
"Matthew Auld" <matthew.auld@intel.com>
Subject: Re: [PATCH v7 1/3] drm/xe/migrate: Atomicize CCS copy command setup
Date: Fri, 17 Oct 2025 15:59:02 -0700
Message-ID: <aPLKNpPcpffFepLo@lstrano-desk.jf.intel.com>
In-Reply-To: <20251017223516.GY1207432@mdroper-desk1.amr.corp.intel.com>
On Fri, Oct 17, 2025 at 03:35:16PM -0700, Matt Roper wrote:
> On Fri, Oct 17, 2025 at 02:21:59PM -0400, Rodrigo Vivi wrote:
> > On Fri, Oct 17, 2025 at 07:51:47PM +0300, Ville Syrjälä wrote:
> > > On Fri, Oct 17, 2025 at 09:59:48PM +0530, K V P, Satyanarayana wrote:
> > > >
> > > >
> > > > On 17-10-2025 20:56, Ville Syrjälä wrote:
> > > > > On Fri, Oct 17, 2025 at 08:46:37PM +0530, K V P, Satyanarayana wrote:
> > > > >>
> > > > >>
> > > > >> On 17-10-2025 19:57, Ville Syrjälä wrote:
> > > > >>> On Fri, Oct 17, 2025 at 07:42:28PM +0530, Satyanarayana K V P wrote:
> > > > >>>> The CCS copy command is a 5-dword sequence. If the vCPU halts during
> > > > >>>> save/restore while this sequence is being programmed, partial writes may
> > > > >>>> trigger page faults when saving IGPU CCS metadata. Use the VMOVDQU
> > > > >>>> instruction to write the sequence atomically.
> > > > >>>
> > > > >>> If this whole thing is so racy, why don't you always add a new
> > > > >>> BB_END after new commands, and only replace the previous BB_END
> > > > >>> with NOOP _after_ the new commands have been fully written?
> > > > >>>
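FWIW, if I'm reading Ville's suggestion right, the ordering would be
roughly the below. Pure sketch: write_new_commands() and the dword
bookkeeping are made up; only MI_NOOP / MI_BATCH_BUFFER_END are real
macros.

        u32 *old_end = cs;      /* currently holds MI_BATCH_BUFFER_END */

        /* 1: append the new commands after the current terminator */
        write_new_commands(old_end + 1, num_new_dws);

        /* 2: terminate the extended sequence */
        old_end[1 + num_new_dws] = MI_BATCH_BUFFER_END;

        /* 3: ensure the above stores are visible before opening the gate */
        wmb();

        /* 4: only now turn the old terminator into a NOOP */
        WRITE_ONCE(*old_end, MI_NOOP);

The GPU then either stops at the old BB_END and sees none of the update,
or runs straight through the fully written commands to the new one.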
> > > > >> We maintain a suballocator for batch buffer management, with size
> > > > >> proportional to system memory (e.g., 16MB suballocator for 8GB SMEM).
> > > > >> Batch buffers are dynamically allocated from this pool based on the
> > > > >> number of active workloads. The entire suballocator region is submitted
> > > > >> to hardware for CCS metadata copy operations.
> > > > >>
> > > > >> We cannot insert BB_END commands after each individual instruction
> > > > >> sequence because additional GPU instructions may be appended later.
> > > > >
> > > > > You *overwrite* the previous BB_END after the new commands have been
> > > > > appended.
> > > > We do not know where the new BB allocation will land. It may not be
> > > > sequential, and every BO has its own BB. BBs are allocated and freed
> > > > frequently as BOs are created and destroyed, so we can't use that
> > > > approach.
> > >
> > > Hmm, could perhaps use second level batches then. Each BO would get
> > > its own second level batch, and the first level would just call them
> > > in sequence. Or is this already running as a second level batch?
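For reference, a second-level layout would be something like the below.
Sketch only: MI_BB_SECOND_LEVEL is a hypothetical flag name (the real
second-level bit encoding is per bspec) and for_each_tracked_bo() is
made up.

        /* First level: just chain the per-BO second-level batches. */
        for_each_tracked_bo(bo) {
                *cs++ = MI_BATCH_BUFFER_START | MI_BB_SECOND_LEVEL;
                *cs++ = lower_32_bits(bo->ccs_bb_addr);
                *cs++ = upper_32_bits(bo->ccs_bb_addr);
        }
        *cs++ = MI_BATCH_BUFFER_END;

Publishing or retiring a BO's copy commands would then just add or
remove a single first-level jump, which is a much smaller update to
make atomic.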
> >
> > This I'm not sure...
> >
> > Matt, do you know?
>
> My understanding of this feature is that we create two additional
> contexts (LRC's) and tell the GuC that they're special --- one should be
> scheduled whenever a VF is being stopped and the other should be
> scheduled when a VF is being started. The intent is to use these
> contexts to do a "context switch" of the VF's CCS data --- saving it out
> when the VF is stopping and bringing it back in when the VF is resumed.
>
> From a hardware point of view I think we could handle things however we
> like in the LRC's ring and/or batch buffers. We could add all the
> necessary copy commands directly to the LRCs' rings if we wanted, or we
> can add them to batchbuffers, or add them to 2- or 3-level nested
> batchbuffers. The current approach actually taken appears to be to
> allocate a large chunk of memory (ctx->mem.ccs_bb_pool) and do an
> MI_BATCH_BUFFER_START off to it. That ccs_bb_pool is originally full of
> NOOP (nothing to do), but as the VF allocates buffers, suballocations of
> the pool are done for each buffer allocated, and those suballocations
> are filled with the necessary commands to copy the CCS data in/out.
> When a buffer is released by the VF, its suballocation of the pool is
> wiped over with NOOPs. Any time a VF starts/stops, the save and restore
> LRCs get scheduled by the GuC and execute their entire ccs_bb_pool as a
> single batchbuffer (mostly containing noops, but with sections of copy
> instructions scattered around).
>
>   Save LRC
> H +-----------------------+
>   | MI_BATCH_BUFFER_START |-------> +-------------------------------+
> T +-----------------------+         | <noops>                       |
>                                     +-------------------------------+
>                                     | instr's to copy CCS for a bo  |
>                                     +-------------------------------+
>                                     | <noops>                       |
>                                     +-------------------------------+
>                                     | instr's to copy CCS for a bo  |
>                                     +-------------------------------+
>                                     | <noops>                       |
>                                     +-------------------------------+
>                                     ~              ...              ~
>                                     +-------------------------------+
>
> So the problem this patch is trying to address is when a suballocation
> of the pool is made and the CPU is in the middle of poking instructions
> into it at the moment the VF is stopped; part of the copy instructions
> will have landed in memory, other parts may not have. The GuC will
> still execute the entire pool as one giant batchbuffer, but the
> subsection that was being updated will be incomplete, possibly in
> harmful ways (e.g., an instruction started, but the addresses it
> references not yet filled in).
>
>
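To spell the race out: pre-patch, emit_copy_ccs() is plain dword stores
into the live pool, so a VF pause can land mid-sequence. Annotated from
the removed hunk later in the patch below; the pause marker is
illustrative:

        *cs++ = XY_CTRL_SURF_COPY_BLT |
                (src_is_indirect ? 0x0 : 0x1) << SRC_ACCESS_TYPE_SHIFT |
                (dst_is_indirect ? 0x0 : 0x1) << DST_ACCESS_TYPE_SHIFT |
                ccs_copy_size;
        *cs++ = lower_32_bits(src_ofs);
        /* <-- VF paused here: GuC executes the pool with the opcode
         *     present but the remaining address dwords still NOOPs */
        *cs++ = upper_32_bits(src_ofs) | mocs;
        *cs++ = lower_32_bits(dst_ofs);
        *cs++ = upper_32_bits(dst_ofs) | mocs;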
> The architecture document for this feature suggests the following:
>
> """
> VF KMD can utilize standard cmd programming technique like
> Shadow cmd buffer or ping/pong and swap BB_Start address to
> avoid partially-updated BB is executed by GuC in case of middle
> update pause.
> """
I totally missed this in the SaS. I wonder if this was added recently or
I'm just bad at reading.
>
> So it sounds like you could have two "save LRCs," make the updates to
> the one that's currently inactive, and then tell the GuC to replace the
> current save context with the other one once you finish an update (if
> the GuC interface lets you do that --- I haven't checked). Any time the
> GuC actually starts running something, it's a complete LRC, ring, and
> batchbuffer, and there are no racing updates to those from the VF KMD.
>
> Alternatively, you could stick with a single context for each, but just
> create a shadow batch buffer instead of a whole shadow context and then
> patch the address that the MI_BATCH_BUFFER_START is jumping to after you
> update the inactive buffer. If the batchbuffer is in the GGTT, then we
> only need an uninterrupted 32-bit update since the upper dword of the
> address is always 0.
Yes, though I don't think it would be "two save LRCs" so much as "two
save BOs".
On the surface, using two BOs sounds promising, but we’d need a
device-level lock to ensure that at most one thread controls which BO is
the shadow and which one the GuC sees. That might be fine, but it’s
something that needs to be considered.
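Roughly what I'm picturing (sketch only; the struct, lock placement, and
helper names are all made up):

        struct xe_ccs_bb_swap {
                struct mutex lock;      /* device-level, made-up name */
                struct xe_bo *live;     /* BB the save/restore LRC jumps to */
                struct xe_bo *shadow;   /* BB the CPU may rewrite */
        };

        /* Publish an updated shadow BB; assumes both BBs live in GGTT. */
        static void ccs_bb_publish(struct xe_ccs_bb_swap *s, u32 *bb_start_addr_dw)
        {
                mutex_lock(&s->lock);

                /* ... rewrite s->shadow with the updated copy commands ... */

                /*
                 * A GGTT offset fits in 32 bits, so flipping which BB the
                 * MI_BATCH_BUFFER_START jumps to is a single aligned store.
                 */
                WRITE_ONCE(*bb_start_addr_dw,
                           lower_32_bits(xe_bo_ggtt_addr(s->shadow)));

                swap(s->live, s->shadow);
                mutex_unlock(&s->lock);
        }

Any racing updater then serializes on the lock before touching either
BO, which is the part I'd want to see thought through.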
Matt
>
> There are probably also ways you could use MI_PREDICATE and/or
> MI_COND_BATCH_BUFFER_END to make the hardware execute the "ready"
> batchbuffer and not the "inactive, updating" batchbuffer without having
> to go back and patch the ring contents via the CPU, but I'd need to
> refresh my memory on the exact usage of those kinds of instructions.
>
>
> Matt
>
>
> >
> > >
> > > It might also be getting a bit complicated I guess, but at least it
> > > wouldn't have all the obvious problems of the SIMD stuff:
> > > - looks like it will explode on non-AVX-capable x86
> > > - will be broken on other arches until someone implements the
> > >   equivalent code (assuming the arch has such an atomic copy
> > >   instruction and supports in-kernel SIMD sufficiently to use it)
> >
> > This is Pantherlake only, and that's the reason why I asked to add a
> > check with error/warn for IS_DGFX()...
> >
> > which, by the way, is an assert... I still don't believe it is enough.
> > I believe a return with a warn_on seems more appropriate, so we really
> > never try to run that code in case of a big future mistake.
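Rodrigo's warn + return would look something like the below (sketch; it
also folds in Ville's non-AVX concern via boot_cpu_has(), and the caller
would still need to handle the early return so the BB isn't silently
left full of NOOPs):

        static void memcpy_vmovdqu(struct xe_device *xe, void *dst,
                                   const void *src, u32 size)
        {
                /* Refuse to run rather than only asserting in debug builds. */
                if (drm_WARN_ON(&xe->drm, IS_DGFX(xe)))
                        return;
                if (drm_WARN_ON(&xe->drm, !boot_cpu_has(X86_FEATURE_AVX)))
                        return;
                /* ... SIMD copy as in the patch ... */
        }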
> >
> > >
> > > >
> > > > -Satya.
> > > > >> Instead, a single BB_END marker is placed at the suballocator's end to
> > > > >> terminate execution.
> > > > >>
> > > > >> This patch ensures race-condition-safe CCS metadata save/restore
> > > > >> operations by guaranteeing atomic writes to the batch buffer, preventing
> > > > >> corruption regardless of when save/restore operations are triggered.
> > > > >>
> > > > >> -Satya.
> > > > >>>> Since VMOVDQU operates on 256-bit chunks, update EMIT_COPY_CCS_DW to emit
> > > > >>>> 8 dwords instead of 5 dwords.
> > > > >>>>
> > > > >>>> Update emit_flush_invalidate() to use VMOVDQU operating with 128-bit
> > > > >>>> chunks.
> > > > >>>>
> > > > >>>> Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
> > > > >>>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > > > >>>> Cc: Matthew Brost <matthew.brost@intel.com>
> > > > >>>> Cc: Matthew Auld <matthew.auld@intel.com>
> > > > >>>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > >>>> Cc: Matt Roper <matthew.d.roper@intel.com>
> > > > >>>>
> > > > >>>> ---
> > > > >>>> V6 -> V7:
> > > > >>>> - Added description explaining why to use assembly instructions for
> > > > >>>> atomicity.
> > > > >>>> - Assert if DGFX tries to use memcpy_vmovdqu(). (Rodrigo)
> > > > >>>> - Include <asm/cpufeature.h> though checkpatch complains. With
> > > > >>>> <linux/cpufeature.h> KUnit is throwing errors.
> > > > >>>>
> > > > >>>> V5 -> V6:
> > > > >>>> - Fixed review comments (Rodrigo)
> > > > >>>>
> > > > >>>> V4 -> V5:
> > > > >>>> - Fixed review comments. (Matt B)
> > > > >>>>
> > > > >>>> V3 -> V4:
> > > > >>>> - Fixed review comments. (Wajdeczko)
> > > > >>>> - Fix issues reported by patchworks.
> > > > >>>>
> > > > >>>> V2 -> V3:
> > > > >>>> - Added support for 128 bit and 256 bit instructions with memcpy_vmovdqu
> > > > >>>> - Updated emit_flush_invalidate() to use vmovdqu instruction.
> > > > >>>>
> > > > >>>> V1 -> V2:
> > > > >>>> - Use memcpy_vmovdqu only for x86 arch and for VF. Else use memcpy
> > > > >>>> (Auld, Matthew)
> > > > >>>> - Fix issues reported by patchworks.
> > > > >>>> ---
> > > > >>>> drivers/gpu/drm/xe/xe_migrate.c | 112 ++++++++++++++++++++++++++------
> > > > >>>> 1 file changed, 91 insertions(+), 21 deletions(-)
> > > > >>>>
> > > > >>>> diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
> > > > >>>> index 3112c966c67d..e0be7396a0ab 100644
> > > > >>>> --- a/drivers/gpu/drm/xe/xe_migrate.c
> > > > >>>> +++ b/drivers/gpu/drm/xe/xe_migrate.c
> > > > >>>> @@ -5,6 +5,8 @@
> > > > >>>>
> > > > >>>> #include "xe_migrate.h"
> > > > >>>>
> > > > >>>> +#include <asm/fpu/api.h>
> > > > >>>> +#include <asm/cpufeature.h>
> > > > >>>> #include <linux/bitfield.h>
> > > > >>>> #include <linux/sizes.h>
> > > > >>>>
> > > > >>>> @@ -33,6 +35,7 @@
> > > > >>>> #include "xe_res_cursor.h"
> > > > >>>> #include "xe_sa.h"
> > > > >>>> #include "xe_sched_job.h"
> > > > >>>> +#include "xe_sriov_vf_ccs.h"
> > > > >>>> #include "xe_sync.h"
> > > > >>>> #include "xe_trace_bo.h"
> > > > >>>> #include "xe_validation.h"
> > > > >>>> @@ -657,18 +660,68 @@ static void emit_pte(struct xe_migrate *m,
> > > > >>>> }
> > > > >>>> }
> > > > >>>>
> > > > >>>> -#define EMIT_COPY_CCS_DW 5
> > > > >>>> +/*
> > > > >>>> + * VF KMD registers two specialized LRCs with the GuC to handle save/restore
> > > > >>>> + * operations for CCS metadata on IGPU. The GuC executes these LRCAs during
> > > > >>>> + * VF save/restore operations.
> > > > >>>> + *
> > > > >>>> + * Each LRC contains a batch buffer pool that GuC submits to hardware during
> > > > >>>> + * VF state save/restore operations. Since these operations can occur
> > > > >>>> + * asynchronously at any time, we must ensure GPU instructions in the batch
> > > > >>>> + * buffer are written atomically to prevent corruption from incomplete writes.
> > > > >>>> + *
> > > > >>>> + * To guarantee atomic instruction writes, we use x86 SIMD instructions
> > > > >>>> + * (128-bit XMM and 256-bit YMM) within kernel_fpu_begin()/kernel_fpu_end()
> > > > >>>> + * sections. This prevents vCPU preemption during instruction generation,
> > > > >>>> + * ensuring complete GPU commands are written to the batch buffer.
> > > > >>>> + */
> > > > >>>> +
> > > > >>>> +static void memcpy_vmovdqu(struct xe_device *xe, void *dst, const void *src, u32 size)
> > > > >>>> +{
> > > > >>>> + xe_assert(xe, !IS_DGFX(xe));
> > > > >>>> +#ifdef CONFIG_X86
> > > > >>>> + kernel_fpu_begin();
> > > > >>>> + if (size == SZ_128) {
> > > > >>>> + asm("vmovdqu (%0), %%xmm0\n"
> > > > >>>> + "vmovups %%xmm0, (%1)\n"
> > > > >>>> + :: "r" (src), "r" (dst) : "memory");
> > > > >>>> + } else if (size == SZ_256) {
> > > > >>>> + asm("vmovdqu (%0), %%ymm0\n"
> > > > >>>> + "vmovups %%ymm0, (%1)\n"
> > > > >>>> + :: "r" (src), "r" (dst) : "memory");
> > > > >>>> + }
> > > > >>>> + kernel_fpu_end();
> > > > >>>> +#endif
> > > > >>>> +}
> > > > >>>> +
> > > > >>>> +static void emit_atomic(struct xe_gt *gt, void *dst, const void *src, u32 size)
> > > > >>>> +{
> > > > >>>> + u32 instr_size = size * BITS_PER_BYTE;
> > > > >>>> +
> > > > >>>> + xe_gt_assert(gt, instr_size == SZ_128 || instr_size == SZ_256);
> > > > >>>> +
> > > > >>>> + if (IS_VF_CCS_READY(gt_to_xe(gt))) {
> > > > >>>> + xe_gt_assert(gt, static_cpu_has(X86_FEATURE_AVX));
> > > > >>>> + memcpy_vmovdqu(gt_to_xe(gt), dst, src, instr_size);
> > > > >>>> + } else {
> > > > >>>> + memcpy(dst, src, size);
> > > > >>>> + }
> > > > >>>> +}
> > > > >>>> +
> > > > >>>> +#define EMIT_COPY_CCS_DW 8
> > > > >>>> static void emit_copy_ccs(struct xe_gt *gt, struct xe_bb *bb,
> > > > >>>> u64 dst_ofs, bool dst_is_indirect,
> > > > >>>> u64 src_ofs, bool src_is_indirect,
> > > > >>>> u32 size)
> > > > >>>> {
> > > > >>>> + u32 dw[EMIT_COPY_CCS_DW] = {MI_NOOP};
> > > > >>>> struct xe_device *xe = gt_to_xe(gt);
> > > > >>>> u32 *cs = bb->cs + bb->len;
> > > > >>>> u32 num_ccs_blks;
> > > > >>>> u32 num_pages;
> > > > >>>> u32 ccs_copy_size;
> > > > >>>> u32 mocs;
> > > > >>>> + u32 i = 0;
> > > > >>>>
> > > > >>>> if (GRAPHICS_VERx100(xe) >= 2000) {
> > > > >>>> num_pages = DIV_ROUND_UP(size, XE_PAGE_SIZE);
> > > > >>>> @@ -686,15 +739,23 @@ static void emit_copy_ccs(struct xe_gt *gt, struct xe_bb *bb,
> > > > >>>> mocs = FIELD_PREP(XY_CTRL_SURF_MOCS_MASK, gt->mocs.uc_index);
> > > > >>>> }
> > > > >>>>
> > > > >>>> - *cs++ = XY_CTRL_SURF_COPY_BLT |
> > > > >>>> - (src_is_indirect ? 0x0 : 0x1) << SRC_ACCESS_TYPE_SHIFT |
> > > > >>>> - (dst_is_indirect ? 0x0 : 0x1) << DST_ACCESS_TYPE_SHIFT |
> > > > >>>> - ccs_copy_size;
> > > > >>>> - *cs++ = lower_32_bits(src_ofs);
> > > > >>>> - *cs++ = upper_32_bits(src_ofs) | mocs;
> > > > >>>> - *cs++ = lower_32_bits(dst_ofs);
> > > > >>>> - *cs++ = upper_32_bits(dst_ofs) | mocs;
> > > > >>>> + dw[i++] = XY_CTRL_SURF_COPY_BLT |
> > > > >>>> + (src_is_indirect ? 0x0 : 0x1) << SRC_ACCESS_TYPE_SHIFT |
> > > > >>>> + (dst_is_indirect ? 0x0 : 0x1) << DST_ACCESS_TYPE_SHIFT |
> > > > >>>> + ccs_copy_size;
> > > > >>>> + dw[i++] = lower_32_bits(src_ofs);
> > > > >>>> + dw[i++] = upper_32_bits(src_ofs) | mocs;
> > > > >>>> + dw[i++] = lower_32_bits(dst_ofs);
> > > > >>>> + dw[i++] = upper_32_bits(dst_ofs) | mocs;
> > > > >>>>
> > > > >>>> + /*
> > > > >>>> + * The CCS copy command is a 5-dword sequence. If the vCPU halts during
> > > > >>>> + * save/restore while this sequence is being issued, partial writes may trigger
> > > > >>>> + * page faults when saving iGPU CCS metadata. Use the VMOVDQU instruction to
> > > > >>>> + * write the sequence atomically.
> > > > >>>> + */
> > > > >>>> + emit_atomic(gt, cs, dw, sizeof(dw));
> > > > >>>> + cs += EMIT_COPY_CCS_DW;
> > > > >>>> bb->len = cs - bb->cs;
> > > > >>>> }
> > > > >>>>
> > > > >>>> @@ -1006,18 +1067,27 @@ static u64 migrate_vm_ppgtt_addr_tlb_inval(void)
> > > > >>>> return (NUM_KERNEL_PDE - 2) * XE_PAGE_SIZE;
> > > > >>>> }
> > > > >>>>
> > > > >>>> -static int emit_flush_invalidate(u32 *dw, int i, u32 flags)
> > > > >>>> +/*
> > > > >>>> + * The MI_FLUSH_DW command is a 4-dword sequence. If the vCPU halts during
> > > > >>>> + * save/restore while this sequence is being issued, partial writes may
> > > > >>>> + * trigger page faults when saving iGPU CCS metadata. Use
> > > > >>>> + * emit_atomic() to write the sequence atomically.
> > > > >>>> + */
> > > > >>>> +#define EMIT_FLUSH_INVALIDATE_DW 4
> > > > >>>> +static int emit_flush_invalidate(struct xe_exec_queue *q, u32 *cs, int i, u32 flags)
> > > > >>>> {
> > > > >>>> u64 addr = migrate_vm_ppgtt_addr_tlb_inval();
> > > > >>>> + u32 dw[EMIT_FLUSH_INVALIDATE_DW] = {MI_NOOP}, j = 0;
> > > > >>>> +
> > > > >>>> + dw[j++] = MI_FLUSH_DW | MI_INVALIDATE_TLB | MI_FLUSH_DW_OP_STOREDW |
> > > > >>>> + MI_FLUSH_IMM_DW | flags;
> > > > >>>> + dw[j++] = lower_32_bits(addr);
> > > > >>>> + dw[j++] = upper_32_bits(addr);
> > > > >>>> + dw[j++] = MI_NOOP;
> > > > >>>>
> > > > >>>> - dw[i++] = MI_FLUSH_DW | MI_INVALIDATE_TLB | MI_FLUSH_DW_OP_STOREDW |
> > > > >>>> - MI_FLUSH_IMM_DW | flags;
> > > > >>>> - dw[i++] = lower_32_bits(addr);
> > > > >>>> - dw[i++] = upper_32_bits(addr);
> > > > >>>> - dw[i++] = MI_NOOP;
> > > > >>>> - dw[i++] = MI_NOOP;
> > > > >>>> + emit_atomic(q->gt, &cs[i], dw, sizeof(dw));
> > > > >>>>
> > > > >>>> - return i;
> > > > >>>> + return i + j;
> > > > >>>> }
> > > > >>>>
> > > > >>>> /**
> > > > >>>> @@ -1062,7 +1132,7 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
> > > > >>>> /* Calculate Batch buffer size */
> > > > >>>> batch_size = 0;
> > > > >>>> while (size) {
> > > > >>>> - batch_size += 10; /* Flush + ggtt addr + 2 NOP */
> > > > >>>> + batch_size += EMIT_FLUSH_INVALIDATE_DW * 2; /* Flush + ggtt addr + 1 NOP */
> > > > >>>> u64 ccs_ofs, ccs_size;
> > > > >>>> u32 ccs_pt;
> > > > >>>>
> > > > >>>> @@ -1103,7 +1173,7 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
> > > > >>>> * sizes here again before copy command is emitted.
> > > > >>>> */
> > > > >>>> while (size) {
> > > > >>>> - batch_size += 10; /* Flush + ggtt addr + 2 NOP */
> > > > >>>> + batch_size += EMIT_FLUSH_INVALIDATE_DW * 2; /* Flush + ggtt addr + 1 NOP */
> > > > >>>> u32 flush_flags = 0;
> > > > >>>> u64 ccs_ofs, ccs_size;
> > > > >>>> u32 ccs_pt;
> > > > >>>> @@ -1126,11 +1196,11 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
> > > > >>>>
> > > > >>>> emit_pte(m, bb, ccs_pt, false, false, &ccs_it, ccs_size, src);
> > > > >>>>
> > > > >>>> - bb->len = emit_flush_invalidate(bb->cs, bb->len, flush_flags);
> > > > >>>> + bb->len = emit_flush_invalidate(q, bb->cs, bb->len, flush_flags);
> > > > >>>> flush_flags = xe_migrate_ccs_copy(m, bb, src_L0_ofs, src_is_pltt,
> > > > >>>> src_L0_ofs, dst_is_pltt,
> > > > >>>> src_L0, ccs_ofs, true);
> > > > >>>> - bb->len = emit_flush_invalidate(bb->cs, bb->len, flush_flags);
> > > > >>>> + bb->len = emit_flush_invalidate(q, bb->cs, bb->len, flush_flags);
> > > > >>>>
> > > > >>>> size -= src_L0;
> > > > >>>> }
> > > > >>>> --
> > > > >>>> 2.51.0
> > > > >>>
> > > > >
> > >
> > > --
> > > Ville Syrjälä
> > > Intel
>
> --
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation