From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH 2/2] drm/xe: Userptr invalidation race with binds fixes
From: Thomas Hellström
To: Matthew Brost
Cc: intel-xe@lists.freedesktop.org, matthew.auld@intel.com
Date: Mon, 24 Feb 2025 09:06:58 +0100
References: <20250220234557.2613010-1-matthew.brost@intel.com>
	 <20250220234557.2613010-3-matthew.brost@intel.com>
	 <7685842d544a4e9673dc4ec2645044a8ded58848.camel@linux.intel.com>
Organization: Intel Sweden AB, Registration Number: 556189-6027
List-Id: Intel Xe graphics driver

Hi,

On Fri, 2025-02-21 at 08:59 -0800, Matthew Brost wrote:
> On Fri, Feb 21, 2025 at 04:27:36PM +0100, Thomas Hellström wrote:
> > On Fri, 2025-02-21 at 06:43 -0800, Matthew Brost wrote:
> > > On Fri, Feb 21, 2025 at 12:34:12PM +0100, Thomas Hellström wrote:
> > > > On Thu, 2025-02-20 at 15:45 -0800, Matthew Brost wrote:
> > > > > Squash bind operations after a userptr invalidation into a
> > > > > clearing of PTEs. This will prevent valid GPU page tables
> > > > > from pointing to stale CPU pages.
> > > > >
> > > > > Fix up initial bind handling: always add VMAs to the
> > > > > invalidation list and clear PTEs.
> > > > >
> > > > > Remove unused rebind variable in xe_pt.
> > > > >
> > > > > Always hold the notifier lock across TLB invalidation in the
> > > > > notifier to prevent a UAF if an unbind races.
> > > > >
> > > > > Including all of the above changes in the Fixes patch in
> > > > > hopes of an easier backport as a single patch.
> > > > >
> > > > > Cc: Thomas Hellström
> > > > > Cc:
> > > > > Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into
> > > > > single job")
> > > > > Signed-off-by: Matthew Brost
> > > > > ---
> > > > >  drivers/gpu/drm/xe/regs/xe_gtt_defs.h |  1 +
> > > > >  drivers/gpu/drm/xe/xe_pt.c            | 86 +++++++++++++++++++--------
> > > > >  drivers/gpu/drm/xe/xe_pt_types.h      |  1 -
> > > > >  drivers/gpu/drm/xe/xe_vm.c            |  4 +-
> > > > >  4 files changed, 64 insertions(+), 28 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > index 4389e5a76f89..ab281e721d94 100644
> > > > > --- a/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > +++ b/drivers/gpu/drm/xe/regs/xe_gtt_defs.h
> > > > > @@ -13,6 +13,7 @@
> > > > >
> > > > >  #define GUC_GGTT_TOP			0xFEE00000
> > > > >
> > > > > +#define XE_PPGTT_IS_LEAF		BIT_ULL(63)
> > > > >  #define XELPG_PPGTT_PTE_PAT3		BIT_ULL(62)
> > > > >  #define XE2_PPGTT_PTE_PAT4		BIT_ULL(61)
> > > > >  #define XE_PPGTT_PDE_PDPE_PAT2		BIT_ULL(12)
> > > > > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > > > > index 1ddcc7e79a93..26b513a54113 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_pt.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > > > > @@ -389,6 +389,8 @@ xe_pt_insert_entry(struct xe_pt_stage_bind_walk *xe_walk, struct
> > > > > xe_pt *parent,
> > > > >  		idx = offset - entry->ofs;
> > > > >  		entry->pt_entries[idx].pt = xe_child;
> > > > >  		entry->pt_entries[idx].pte = pte;
> > > > > +		if (!xe_child)
> > > > > +			entry->pt_entries[idx].pte |= XE_PPGTT_IS_LEAF;
> > > >
> > > > I don't think this is needed.
> > > >
> > >
> > > It is. Will explain below.
> > >
> > > > >  		entry->qwords++;
> > > > >  	}
> > > > >
> > > > > @@ -792,6 +794,24 @@ static const struct xe_pt_walk_ops xe_pt_zap_ptes_ops = {
> > > > >  	.pt_entry = xe_pt_zap_ptes_entry,
> > > > >  };
> > > > >
> > > > > +static bool __xe_pt_zap_ptes(struct xe_tile *tile, struct xe_vma *vma)
> > > > > +{
> > > > > +	struct xe_pt_zap_ptes_walk xe_walk = {
> > > > > +		.base = {
> > > > > +			.ops = &xe_pt_zap_ptes_ops,
> > > > > +			.shifts = xe_normal_pt_shifts,
> > > > > +			.max_level = XE_PT_HIGHEST_LEVEL,
> > > > > +		},
> > > > > +		.tile = tile,
> > > > > +	};
> > > > > +	struct xe_pt *pt = xe_vma_vm(vma)->pt_root[tile->id];
> > > > > +
> > > > > +	(void)xe_pt_walk_shared(&pt->base, pt->level, xe_vma_start(vma),
> > > > > +				xe_vma_end(vma), &xe_walk.base);
> > > > > +
> > > > > +	return xe_walk.needs_invalidate;
> > > > > +}
> > > > > +
> > > > >  /**
> > > > >   * xe_pt_zap_ptes() - Zap (zero) gpu ptes of an address range
> > > > >   * @tile: The tile we're zapping for.
> > > > > @@ -810,24 +830,12 @@ static const struct xe_pt_walk_ops xe_pt_zap_ptes_ops = {
> > > > >   */
> > > > >  bool xe_pt_zap_ptes(struct xe_tile *tile, struct xe_vma *vma)
> > > > >  {
> > > > > -	struct xe_pt_zap_ptes_walk xe_walk = {
> > > > > -		.base = {
> > > > > -			.ops = &xe_pt_zap_ptes_ops,
> > > > > -			.shifts = xe_normal_pt_shifts,
> > > > > -			.max_level = XE_PT_HIGHEST_LEVEL,
> > > > > -		},
> > > > > -		.tile = tile,
> > > > > -	};
> > > > > -	struct xe_pt *pt = xe_vma_vm(vma)->pt_root[tile->id];
> > > > >  	u8 pt_mask = (vma->tile_present & ~vma->tile_invalidated);
> > > > >
> > > > >  	if (!(pt_mask & BIT(tile->id)))
> > > > >  		return false;
> > > > >
> > > > > -	(void)xe_pt_walk_shared(&pt->base, pt->level, xe_vma_start(vma),
> > > > > -				xe_vma_end(vma), &xe_walk.base);
> > > > > -
> > > > > -	return xe_walk.needs_invalidate;
> > > > > +	return __xe_pt_zap_ptes(tile, vma);
> > > > >  }
> > > > >
> > > > >  static void
> > > > > @@ -843,9 +851,10 @@ xe_vm_populate_pgtable(struct xe_migrate_pt_update *pt_update, struct xe_tile *t
> > > > >  	for (i = 0; i < num_qwords; i++) {
> > > > >  		if (map)
> > > > >  			xe_map_wr(tile_to_xe(tile), map, (qword_ofs + i) *
> > > > > -				  sizeof(u64), u64, ptes[i].pte);
> > > > > +				  sizeof(u64), u64, ptes[i].pte &
> > > > > +				  ~XE_PPGTT_IS_LEAF);
> > > > >  		else
> > > > > -			ptr[i] = ptes[i].pte;
> > > > > +			ptr[i] = ptes[i].pte & ~XE_PPGTT_IS_LEAF;
> > > > >  	}
> > > > >  }
> > > > >
> > > > > @@ -1201,7 +1210,30 @@ static bool xe_pt_userptr_inject_eagain(struct xe_userptr_vma *uvma)
> > > > >
> > > > >  #endif
> > > > >
> > > > > -static int vma_check_userptr(struct xe_vm *vm, struct xe_vma *vma,
> > > > > +static void
> > > > > +vma_convert_to_invalidation(struct xe_tile *tile, struct xe_vma *vma,
> > > > > +			    struct xe_vm_pgtable_update_ops *pt_update)
> > > > > +{
> > > > > +	int i, j, k;
> > > > > +
> > > > > +	for (i = 0; i < pt_update->current_op; ++i) {
> > > > > +		struct xe_vm_pgtable_update_op *op = &pt_update->ops[i];
> > > > > +
> > > > > +		if (vma == op->vma && (op->bind || op->rebind)) {
> > > > > +			for (j = 0; j < op->num_entries; ++j) {
> > > > > +				for (k = 0; k < op->entries[j].qwords; ++k)
> > > > > +					if (op->entries[j].pt_entries[k].pte &
> > > > > +					    XE_PPGTT_IS_LEAF)
> > > >
> > > > IIRC it's a leaf iff op->entries[j].pt->level == 0 ||
> > > > xe_pte_is_huge(pte) (xe_pte_is_huge(pte) needs to be coded).
> > > >
> > >
> > > The 'op->entries[j].pt' here is overridden at this point as part
> > > of the code which handles the unwind on error. See
> > > xe_pt_commit_prepare_bind.
> > >
> > > We can't reason about whether the PTE is a leaf without knowing
> > > the level, thus the extra bit here. Of course this could be coded
> > > differently. Open to ideas, but this might not be needed if we
> > > just return -EAGAIN - see below.
> > >
> > > > > +						op->entries[j].pt_entries[k].pte = 0;
> > > >
> > > > This should be a scratch PTE unless it's a faulting VM?
> > > >
> > >
> > > Yes, if the VM is in scratch mode, this should be set to scratch.
> > >
> > > >
> > > > > +			}
> > > > > +		}
> > > > > +	}
> > > > > +
> > > > > +	__xe_pt_zap_ptes(tile, vma);
> > > >
> > > > Umm, a zap here would race completely with other pipelined
> > > > updates, right? Why is it needed?
> > > >
> > >
> > > The PT tree is fully updated at this point.
The nature of an
> > > array of binds means each operation must fully update the PT
> > > tree before running the next operation. Thus it is safe to call
> > > zap here.
> > >
> > > This is needed for invalidating leaves which are part of an
> > > unconnected tree always set up by the CPU.
> > >
> > > e.g. A 4k bind on an empty VM does this:
> > >
> > > - 1 PT op to prune in a PTE at the highest level
> > > - (Highest level - 1) to 0 are part of an unconnected tree set
> > >   up by the CPU
> > >
> > > Here we'd only want to invalidate the level 0 PTE, which
> > > __xe_pt_zap_ptes does.
> > >
> > > What is missing, as above: the zap would need to program a
> > > scratch entry if the VM has scratch enabled.
> >
> > What I meant was, let's say you have *pipelined* a number of
> > bind-exec-unbinds. Then the zap might be landing between a
> > previous bind and exec, since it's touching shared page-tables.
> > The userptr notifier waits for all previous PT modifications to
> > complete to avoid this.
> >
> > The rule we have is: PT structure updates are done synchronously,
> > but PT value updates are queued, unless they are touching
> > non-shared page-tables that aren't published yet.
> >
>
> So we'd have to wait here on BOOKKEEP slots before calling zap? I
> thought zap only touched leaf entries?
>
> e.g. The VM has a single VMA 0x0000-0x2000; zap would invalidate 2
> PTEs at 0x0000-0x1000 and 0x1000-0x2000, not the higher-level PDEs,
> even though the PDEs are not shared in the current structure.
>
> If zap hits the PDEs, then yeah, this would be an issue with
> pipelining.

It hits the shared page-tables, so yes, it will also hit higher
levels. But note that even if it hits level 0 shared page-tables
(leaves), like in your example, there might be a pipelined bind to
those still not executed.

Question is, though, do we really need this zap if we also modify the
non-leaf PTE entries to be 0 / scratch pages?
That would be pretty much the same as a zap?

/Thomas

>
> > >
> > > > > +}
> > > > > +
> > > > > +static int vma_check_userptr(struct xe_tile *tile, struct xe_vm *vm,
> > > > > +			     struct xe_vma *vma,
> > > > >  			     struct xe_vm_pgtable_update_ops *pt_update)
> > > > >  {
> > > > >  	struct xe_userptr_vma *uvma;
> > > > > @@ -1215,9 +1247,6 @@ static int vma_check_userptr(struct xe_vm *vm, struct xe_vma *vma,
> > > > >  	uvma = to_userptr_vma(vma);
> > > > >  	notifier_seq = uvma->userptr.notifier_seq;
> > > > >
> > > > > -	if (uvma->userptr.initial_bind && !xe_vm_in_fault_mode(vm))
> > > > > -		return 0;
> > > > > -
> > > > >  	if (!mmu_interval_read_retry(&uvma->userptr.notifier,
> > > > >  				     notifier_seq) &&
> > > > >  	    !xe_pt_userptr_inject_eagain(uvma))
> > > > > @@ -1231,6 +1260,8 @@ static int vma_check_userptr(struct xe_vm *vm, struct xe_vma *vma,
> > > > >  			 &vm->userptr.invalidated);
> > > > >  	spin_unlock(&vm->userptr.invalidated_lock);
> > > > >
> > > > > +	vma_convert_to_invalidation(tile, vma, pt_update);
> > > > > +
> > > > >  	if (xe_vm_in_preempt_fence_mode(vm)) {
> > > > >  		struct dma_resv_iter cursor;
> > > > >  		struct dma_fence *fence;
> > > >
> > > > And I really think here we should remove the open-coded and
> > > > undocumented invalidation (it took me like 15 minutes to
> > > > understand what the code did, and then I'm already up-to-date
> > > > with how invalidation works).
> > > >
> > >
> > > Yeah, this ended up being more complicated than I thought it
> > > would be, and it took me several hours to get this (mostly)
> > > right.
> > >
> > > > If just returning -EAGAIN doesn't really work here for the
> > > > preempt-
> > >
> > > I think returning -EAGAIN should work. A quick hack got what
> > > appears to be a livelock on
> > > xe_vm.munmap-style-unbind-userptr-one-partial with error
> > > injection enabled. Need to dig in more, but my guess is this
> > > test does a user bind which results in enough internal ops that
> > > the error injection is always hit on every user bind. If this is
> > > the case, we could tweak the injection algorithm to use a
> > > counter based on user binds, not internal ops.
> > >
> > > In practice, getting an invalidation between getting the CPU
> > > pages and this step of the bind should be exceedingly rare, so
> > > returning -EAGAIN doesn't seem like a huge deal. But this code
> > > path could really help for SVM prefetches, where perhaps a
> > > single page is touched by the CPU, thus aborting the entire
> > > prefetch. If we could just squash that part of the prefetch into
> > > an invalidation, that could be useful. So weighing the
> > > complexity of an invalidation vs -EAGAIN needs to be considered.
> > >
>
> Can you comment on whether you think we should pursue this patch or
> go the -EAGAIN route with updated error injection? I don't see other
> options here.
>
> > > > fence case, then I suggest that the error injection instead
> > > > triggers a real invalidation outside of the notifier lock.
> > >
> > > If the error injection really kicked the notifier, that would be
> > > better.
> > >
> > > > Perhaps using mmu_notifier_invalidate_range_[start|end]()?
> > > >
> > >
> > > Off the top of my head, unsure how this would be implemented or
> > > if it is possible.
It's generally a bad idea to call low-level core MM
> > > functions, though, and I could see that upsetting the core MM
> > > people too if they make changes to whatever functions we need to
> > > call.
> >
> > Good point. Maybe we could call our userptr notifier and update it
> > so that the check for first bind is done after the list addition.
> >
>
> I don't think we can call the notifier to mess with the seqno unless
> we enter through the core functions. Without a notifier seqno
> update, we won't get into this code.
>
> > >
> > > > But for now, since this is a fix, restrict the -EINVAL
> > > > injection to cases where it works and that we want to test.
> > > >
> > >
> > > Not following this.
> >
> > What I meant was: just test the -EINVAL error injection in the
> > cases where aborting the whole operation is the expected result
> > (IIRC that was with page-faulting).
> >
>
> Still not following. I don't see how -EINVAL relates to the
> xe_pt_userptr_inject_eagain code, which is how we test this code
> path.
>
> Matt
>
> > /Thomas
> >
> >
> > >
> > > Matt
> > >
> > > > Thanks,
> > > >
> > > > /Thomas
> > > >
> > > >
> > > > > @@ -1252,7 +1283,8 @@ static int vma_check_userptr(struct xe_vm *vm, struct xe_vma *vma,
> > > > >  	return 0;
> > > > >  }
> > > > >
> > > > > -static int op_check_userptr(struct xe_vm *vm, struct xe_vma_op *op,
> > > > > +static int op_check_userptr(struct xe_tile *tile, struct xe_vm *vm,
> > > > > +			    struct xe_vma_op *op,
> > > > >  			    struct xe_vm_pgtable_update_ops *pt_update)
> > > > >  {
> > > > >  	int err = 0;
> > > > > @@ -1264,18 +1296,21 @@ static int op_check_userptr(struct xe_vm *vm, struct xe_vma_op *op,
> > > > >  		if (!op->map.immediate && xe_vm_in_fault_mode(vm))
> > > > >  			break;
> > > > >
> > > > > -		err = vma_check_userptr(vm, op->map.vma, pt_update);
> > > > > +		err = vma_check_userptr(tile, vm, op->map.vma, pt_update);
> > > > >  		break;
> > > > >  	case DRM_GPUVA_OP_REMAP:
> > > > >  		if (op->remap.prev)
> > > > > -			err = vma_check_userptr(vm, op->remap.prev, pt_update);
> > > > > +			err = vma_check_userptr(tile, vm, op->remap.prev,
> > > > > +						pt_update);
> > > > >  		if (!err && op->remap.next)
> > > > > -			err = vma_check_userptr(vm, op->remap.next, pt_update);
> > > > > +			err = vma_check_userptr(tile, vm, op->remap.next,
> > > > > +						pt_update);
> > > > >  		break;
> > > > >  	case DRM_GPUVA_OP_UNMAP:
> > > > >  		break;
> > > > >  	case DRM_GPUVA_OP_PREFETCH:
> > > > > -		err = vma_check_userptr(vm, gpuva_to_vma(op->base.prefetch.va),
> > > > > +		err = vma_check_userptr(tile, vm,
> > > > > +					gpuva_to_vma(op-
> > > > > >base.prefetch.va),
> > > > >  					pt_update);
> > > > >  		break;
> > > > >  	default:
> > > > > @@ -1301,7 +1336,8 @@ static int xe_pt_userptr_pre_commit(struct xe_migrate_pt_update *pt_update)
> > > > >  	down_read(&vm->userptr.notifier_lock);
> > > > >
> > > > >  	list_for_each_entry(op, &vops->list, link) {
> > > > > -		err = op_check_userptr(vm, op, pt_update_ops);
> > > > > +		err = op_check_userptr(&vm->xe->tiles[pt_update->tile_id],
> > > > > +				       vm, op, pt_update_ops);
> > > > >  		if (err) {
> > > > >  			up_read(&vm->userptr.notifier_lock);
> > > > >  			break;
> > > > > diff --git a/drivers/gpu/drm/xe/xe_pt_types.h b/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > index 384cc04de719..0f9b7650cd6f 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > +++ b/drivers/gpu/drm/xe/xe_pt_types.h
> > > > > @@ -29,7 +29,6 @@ struct xe_pt {
> > > > >  	struct xe_bo *bo;
> > > > >  	unsigned int level;
> > > > >  	unsigned int num_live;
> > > > > -	bool rebind;
> > > > >  	bool is_compact;
> > > > >  #if IS_ENABLED(CONFIG_DRM_XE_DEBUG_VM)
> > > > >  	/** addr: Virtual address start address of the PT. */
> > > > > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > > > > index ea2e287e6526..f90e5c92010c 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_vm.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > > > > @@ -623,8 +623,6 @@ static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
> > > > >  		spin_unlock(&vm->userptr.invalidated_lock);
> > > > >  	}
> > > > >
> > > > > -	up_write(&vm->userptr.notifier_lock);
> > > > > -
> > > > >  	/*
> > > > >  	 * Preempt fences turn into schedule disables, pipeline
> > > > >  	 * these.
> > > > >  	 * Note that even in fault mode, we need to wait for binds and
> > > > > @@ -647,6 +645,8 @@ static bool vma_userptr_invalidate(struct mmu_interval_notifier *mni,
> > > > >  		XE_WARN_ON(err);
> > > > >  	}
> > > > >
> > > > > +	up_write(&vm->userptr.notifier_lock);
> > > > > +
> > > > >  	trace_xe_vma_userptr_invalidate_complete(vma);
> > > > >
> > > > >  	return true;
> > > >
> >