Message-ID: <197d3d5ce233a42884e75e0a743c9faad33639e1.camel@linux.intel.com>
From: Thomas Hellström
To: Matthew Brost
Cc: intel-xe@lists.freedesktop.org
Date: Sun, 02 Jul 2023 23:13:43 +0200
References: <20230629205134.111849-1-thomas.hellstrom@linux.intel.com> <20230629205134.111849-3-thomas.hellstrom@linux.intel.com>
Organization: Intel Sweden AB, Registration Number: 556189-6027
Subject: Re: [Intel-xe] [PATCH 2/2] drm/xe: Fix the separate bind-engine race using coarse-granularity dependencies
List-Id: Intel Xe graphics driver

On Sat, 2023-07-01 at 04:21 +0000, Matthew Brost wrote:
> On Thu, Jun 29, 2023 at 10:51:34PM +0200, Thomas Hellström wrote:
> > Separate bind-engines operating on the same VM range might race
> > updating page-tables. To make sure that doesn't happen, each
> > page-table update operation needs to collect internal dependencies
> > to await before the job is executed.
> >
> > Provide an infrastructure to do that. Initially we save a single
> > dma-fence for the entire VM, which thus removes the benefit of
> > separate bind-engines in favour of fixing the race, but more
> > fine-grained dependency tracking can be achieved by using, for
> > example, the same method as the i915 vma_resources (an interval
> > tree storing unsignaled fences). That of course comes with
> > increased code complexity.
> >
> > This patch will break the xe_vm@bind-engines-independent igt test,
> > but that test would need an update anyway to avoid the independent
> > binds using the same address range. In any case, such a test would
> > not work with the initial xe implementation unless the binds were
> > using different vms.
> >
>
> We need to do better than this, as it makes bind engines useless:
> everything is serialized.

Yes, agreed, and as mentioned in the commit message this fixes the bug
and provides an infrastructure for a better follow-up. Note that a
client can never *rely* on bind-engines executing separately, since
they use common resources that may become restricted, even if they
would typically execute separately.

>
> Hmm, how about an mtree where we store fences for un/bind jobs, with
> the key being the highest level at which the tree is pruned or
> unpruned?
>
> Let's do an example on an empty tree with 48 bits of VA w/ 4k pages:
>
> - Bind 0x0000 to 0x1000 <- Inserts an mtree entry with a key of
>   0x0 -> (0x1 << 39), fence A
>
> - Bind 0x1000 to 0x2000 <- Waits on fence A as the lookup finds it;
>   no new fence is inserted as the only entry inserted was a level 0
>   leaf
>
> - Bind (0x1 << 39) to (0x1 << 39) + 0x1000 <- No need to wait on
>   fence A as the lookup fails; insert new fence B with key
>   (0x1 << 39) -> (0x2 << 39)
>
> - Unbind 0x1000 to 0x2000 <- No need to wait on fence A as the lookup
>   fails; no new fence is inserted as the only entry removed was a
>   level 0 leaf
>
> - Unbind 0x0000 to 0x1000 <- Waits on fence A as the lookup finds it;
>   insert fence C with a key of 0x0 -> (0x1 << 39)
>
> I think this would be fairly simple to implement. The GPUVA series
> has examples of how to implement mtrees with range keys [1].
>
> One more thing is how to clean up the mtree fences; I think a garbage
> collector which traverses the mtree every so often and removes
> signaled fences should work just fine.
>
> What do you think? Crazy idea or does it seem reasonable? If it is
> the latter,

This is more or less exactly what the commit message suggests and what
is done for the i915 vma resources handling, except that the latter
uses an overlapping interval tree (map / unmap ranges would overlap,
which I figure makes it impossible to use an mtree?). Did you have a
chance to look at the vma resources implementation? The fences in the
interval tree there are cleaned up using fence-signalling callbacks.

> let's talk about who should code this up.

I had planned to do that as a follow-up patch. IMO the functionality
of this patch is good enough for a bugfix and can be built upon for a
complete solution. Separate execution of bind engines is a (probably
important) optimization, but at this point I think the priority must
be fixing the bug.

/Thomas

>
> Lastly, I have IGTs to expose these races [2], [3]; I think the IGTs
> should work after these changes.
>
> Matt
>
> [1] https://patchwork.freedesktop.org/patch/544863/?series=120000&rev=3
> [2] https://gitlab.freedesktop.org/drm/xe/igt-gpu-tools/-/merge_requests/13/diffs?commit_id=2de056f6e9213a804f8b0489bbd91b989834d158
> [3] https://gitlab.freedesktop.org/drm/xe/igt-gpu-tools/-/merge_requests/13/diffs?commit_id=23ea98fce7523b2aa252f4fe19411f5591a5623b
>
> > Signed-off-by: Thomas Hellström
> > ---
> >  drivers/gpu/drm/xe/xe_migrate.c  |  2 ++
> >  drivers/gpu/drm/xe/xe_migrate.h  |  2 ++
> >  drivers/gpu/drm/xe/xe_pt.c       | 48 ++++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/xe/xe_vm.c       |  1 +
> >  drivers/gpu/drm/xe/xe_vm_types.h |  8 ++++++
> >  5 files changed, 61 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
> > index 41c90f6710ee..ff0a422f59a5 100644
> > --- a/drivers/gpu/drm/xe/xe_migrate.c
> > +++ b/drivers/gpu/drm/xe/xe_migrate.c
> > @@ -1073,6 +1073,7 @@ xe_migrate_update_pgtables_cpu(struct xe_migrate *m,
> >  			return ERR_PTR(-ETIME);
> >  
> >  	if (ops->pre_commit) {
> > +		pt_update->job = NULL;
> >  		err = ops->pre_commit(pt_update);
> >  		if (err)
> >  			return ERR_PTR(err);
> > @@ -1294,6 +1295,7 @@ xe_migrate_update_pgtables(struct xe_migrate *m,
> >  		goto err_job;
> >  
> >  	if (ops->pre_commit) {
> > +		pt_update->job = job;
> >  		err = ops->pre_commit(pt_update);
> >  		if (err)
> >  			goto err_job;
> > diff --git a/drivers/gpu/drm/xe/xe_migrate.h b/drivers/gpu/drm/xe/xe_migrate.h
> > index 204337ea3b4e..b4135876e3f7 100644
> > --- a/drivers/gpu/drm/xe/xe_migrate.h
> > +++ b/drivers/gpu/drm/xe/xe_migrate.h
> > @@ -69,6 +69,8 @@ struct xe_migrate_pt_update {
> >  	const struct xe_migrate_pt_update_ops *ops;
> >  	/** @vma: The vma we're updating the pagetable for. */
> >  	struct xe_vma *vma;
> > +	/** @job: The job if a GPU page-table update. NULL otherwise */
> > +	struct xe_sched_job *job;
> >  };
> >  
> >  struct xe_migrate *xe_migrate_init(struct xe_tile *tile);
> > diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> > index fe1c77b139e4..f38e7b5a3b32 100644
> > --- a/drivers/gpu/drm/xe/xe_pt.c
> > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > @@ -1119,6 +1119,42 @@ struct xe_pt_migrate_pt_update {
> >  	bool locked;
> >  };
> >  
> > +/*
> > + * This function adds the needed dependencies to a page-table update job
> > + * to make sure racing jobs for separate bind engines don't race writing
> > + * to the same page-table range, wreaking havoc. Initially use a single
> > + * fence for the entire VM. An optimization would use smaller granularity.
> > + */
> > +static int xe_pt_vm_dependencies(struct xe_sched_job *job, struct xe_vm *vm)
> > +{
> > +	int err;
> > +
> > +	if (!vm->last_update_fence)
> > +		return 0;
> > +
> > +	if (dma_fence_is_signaled(vm->last_update_fence)) {
> > +		dma_fence_put(vm->last_update_fence);
> > +		vm->last_update_fence = NULL;
> > +		return 0;
> > +	}
> > +
> > +	/* Is this a CPU update? GPU is busy updating, so return an error */
> > +	if (!job)
> > +		return -ETIME;
> > +
> > +	dma_fence_get(vm->last_update_fence);
> > +	err = drm_sched_job_add_dependency(&job->drm, vm->last_update_fence);
> > +	if (err)
> > +		dma_fence_put(vm->last_update_fence);
> > +
> > +	return err;
> > +}
> > +
> > +static int xe_pt_pre_commit(struct xe_migrate_pt_update *pt_update)
> > +{
> > +	return xe_pt_vm_dependencies(pt_update->job, pt_update->vma->vm);
> > +}
> > +
> >  static int xe_pt_userptr_pre_commit(struct xe_migrate_pt_update *pt_update)
> >  {
> >  	struct xe_pt_migrate_pt_update *userptr_update =
> > @@ -1126,6 +1162,10 @@ static int xe_pt_userptr_pre_commit(struct xe_migrate_pt_update *pt_update)
> >  	struct xe_vma *vma = pt_update->vma;
> >  	unsigned long notifier_seq = vma->userptr.notifier_seq;
> >  	struct xe_vm *vm = vma->vm;
> > +	int err = xe_pt_vm_dependencies(pt_update->job, vm);
> > +
> > +	if (err)
> > +		return err;
> >  
> >  	userptr_update->locked = false;
> >  
> > @@ -1164,6 +1204,7 @@ static int xe_pt_userptr_pre_commit(struct xe_migrate_pt_update *pt_update)
> >  
> >  static const struct xe_migrate_pt_update_ops bind_ops = {
> >  	.populate = xe_vm_populate_pgtable,
> > +	.pre_commit = xe_pt_pre_commit,
> >  };
> >  
> >  static const struct xe_migrate_pt_update_ops userptr_bind_ops = {
> > @@ -1345,6 +1386,9 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_engine *e,
> >  	if (!IS_ERR(fence)) {
> >  		LLIST_HEAD(deferred);
> >  
> > +		dma_fence_put(vm->last_update_fence);
> > +		vm->last_update_fence = dma_fence_get(fence);
> > +
> >  		/* TLB invalidation must be done before signaling rebind */
> >  		if (ifence) {
> >  			int err = invalidation_fence_init(tile->primary_gt, ifence, fence,
> > @@ -1591,6 +1635,7 @@ xe_pt_commit_unbind(struct xe_vma *vma,
> >  
> >  static const struct xe_migrate_pt_update_ops unbind_ops = {
> >  	.populate = xe_migrate_clear_pgtable_callback,
> > +	.pre_commit = xe_pt_pre_commit,
> >  };
> >  
> >  static const struct xe_migrate_pt_update_ops userptr_unbind_ops = {
> > @@ -1666,6 +1711,9 @@ __xe_pt_unbind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_engine *e
> >  	if (!IS_ERR(fence)) {
> >  		int err;
> >  
> > +		dma_fence_put(vm->last_update_fence);
> > +		vm->last_update_fence = dma_fence_get(fence);
> > +
> >  		/* TLB invalidation must be done before signaling unbind */
> >  		err = invalidation_fence_init(tile->primary_gt, ifence, fence, vma);
> >  		if (err) {
> > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > index 8b8c9c5aeb01..f90f3a7c6ede 100644
> > --- a/drivers/gpu/drm/xe/xe_vm.c
> > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > @@ -1517,6 +1517,7 @@ static void vm_destroy_work_func(struct work_struct *w)
> >  
> >  	trace_xe_vm_free(vm);
> >  	dma_fence_put(vm->rebind_fence);
> > +	dma_fence_put(vm->last_update_fence);
> >  	dma_resv_fini(&vm->resv);
> >  	kfree(vm);
> >  }
> > diff --git a/drivers/gpu/drm/xe/xe_vm_types.h b/drivers/gpu/drm/xe/xe_vm_types.h
> > index c148dd49a6ca..5d9eebe5c6bb 100644
> > --- a/drivers/gpu/drm/xe/xe_vm_types.h
> > +++ b/drivers/gpu/drm/xe/xe_vm_types.h
> > @@ -343,6 +343,14 @@ struct xe_vm {
> >  		bool capture_once;
> >  	} error_capture;
> >  
> > +	/**
> > +	 * @last_update_fence: fence representing the last page-table
> > +	 * update on this VM. Used to avoid races between separate
> > +	 * bind engines. Ideally this should be an interval tree of
> > +	 * unsignaled fences. Protected by the vm resv.
> > +	 */
> > +	struct dma_fence *last_update_fence;
> > +
> >  	/** @batch_invalidate_tlb: Always invalidate TLB before batch start */
> >  	bool batch_invalidate_tlb;
> >  };
> > --
> > 2.40.1
> >