From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 21 Aug 2024 08:20:49 -0700
Message-ID: <875xrt9a26.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Matthew Brost
Cc: intel-xe@lists.freedesktop.org, Jose Souza, Lionel Landwerlin,
 Umesh Nerlige Ramappa, Jonathan Cavitt
Subject: Re: [PATCH 4/7] drm/xe/oa: Signal output fences
References: <20240820005808.1412649-1-ashutosh.dixit@intel.com>
 <20240820005808.1412649-5-ashutosh.dixit@intel.com>
List-Id: Intel Xe graphics driver

On Tue, 20 Aug 2024 12:23:13 -0700, Matthew Brost wrote:
>

Hi Matt,

Thanks for all the feedback. Yes, this is my first encounter with
dma_fences, so not everything is smooth yet.

> On Mon, Aug 19, 2024 at 05:58:05PM -0700, Ashutosh Dixit wrote:
> > Introduce 'struct xe_oa_fence', which includes the dma_fence used to
> > signal output fences in the xe_sync array. The fences are signaled
> > asynchronously. When there are no output fences to signal, the OA
> > configuration wait is re-introduced synchronously into the ioctl.
> >
> > v2: Don't wait in the work, use callback + delayed work (Matt B)
> >     Use a single, not a per-fence spinlock (Matt Brost)
> >
> > Suggested-by: Matthew Brost
> > Signed-off-by: Ashutosh Dixit
> > ---
> >  drivers/gpu/drm/xe/xe_oa.c       | 110 +++++++++++++++++++++++++++----
> >  drivers/gpu/drm/xe/xe_oa_types.h |   3 +
> >  2 files changed, 100 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c
> > index cad8f54500a10..1478d88722170 100644
> > --- a/drivers/gpu/drm/xe/xe_oa.c
> > +++ b/drivers/gpu/drm/xe/xe_oa.c
> > @@ -100,6 +100,15 @@ struct xe_oa_config_bo {
> >  	struct xe_bb *bb;
> >  };
> >
> > +struct xe_oa_fence {
> > +	/* @base: dma fence base */
> > +	struct dma_fence base;
> > +	/* @work: work to signal @base */
> > +	struct delayed_work work;
> > +	/* @cb: callback to schedule @work */
> > +	struct dma_fence_cb cb;
> > +};
> > +
> >  #define DRM_FMT(x) DRM_XE_OA_FMT_TYPE_##x
> >
> >  static const struct xe_oa_format oa_formats[] = {
> > @@ -945,13 +954,62 @@ xe_oa_alloc_config_buffer(struct xe_oa_stream *stream, struct xe_oa_config *oa_c
> >  	return oa_bo;
> >  }
> >
> > +static void xe_oa_fence_work_fn(struct work_struct *w)
> > +{
> > +	struct xe_oa_fence *ofence = container_of(w, typeof(*ofence), work.work);
> > +
> > +	/* Signal fence to indicate new OA configuration is active */
> > +	dma_fence_signal(&ofence->base);
> > +	dma_fence_put(&ofence->base);
> > +}
> > +
> > +static void xe_oa_config_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
> > +{
> > +	/* Additional empirical delay needed for NOA programming after registers are written */
> > +#define NOA_PROGRAM_ADDITIONAL_DELAY_US 500
> > +
> > +	struct xe_oa_fence *ofence = container_of(cb, typeof(*ofence), cb);
> > +
> > +	INIT_DELAYED_WORK(&ofence->work, xe_oa_fence_work_fn);
> > +	queue_delayed_work(system_unbound_wq, &ofence->work,
> > +			   usecs_to_jiffies(NOA_PROGRAM_ADDITIONAL_DELAY_US));
> > +	dma_fence_put(fence);
> > +}
> > +
> > +static const char *xe_oa_get_driver_name(struct dma_fence *fence)
> > +{
> > +	return "xe_oa";
> > +}
> > +
> > +static const char *xe_oa_get_timeline_name(struct dma_fence *fence)
> > +{
> > +	return "unbound";
> > +}
> > +
> > +static const struct dma_fence_ops xe_oa_fence_ops = {
> > +	.get_driver_name = xe_oa_get_driver_name,
> > +	.get_timeline_name = xe_oa_get_timeline_name,
> > +};
> > +
> > +static struct xe_oa_fence *xe_oa_fence_arm(struct xe_oa_stream *stream)
> > +{
> > +	struct xe_oa_fence *ofence;
> > +
> > +	ofence = kzalloc(sizeof(*ofence), GFP_KERNEL);
> > +	if (!ofence)
> > +		return ERR_PTR(-ENOMEM);
>
> I'd split this out so the malloc is done before submitting the job and
> dma_fence_init is done after. That way, once the job is submitted, there
> are no failure points.

OK, done in v3, it definitely simplifies the code flow.

> Also, doing malloc after a job is submitted runs into the dma-fence
> rules too: you have a malloc in the signaling path of a user dma-fence.

I believe you are referring to this:

 *
 * Drivers are allowed to call dma_fence_wait() from their &shrinker
 * callbacks. This means any code required for fence completion cannot
 * allocate memory with GFP_KERNEL.

> It probably works the way you have it, but best practice is to follow
> the changes I suggest.

Agreed.
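For readers following the thread, the split agreed on above might look roughly like this (an untested sketch using names from this patch; xe_oa_submit_bb() is a stand-in for whatever submits the config batch, and the actual v3 code may differ):

```c
/* Sketch only: the fallible allocation happens before job submission */
ofence = kzalloc(sizeof(*ofence), GFP_KERNEL);
if (!ofence)
	return -ENOMEM;

fence = xe_oa_submit_bb(stream, oa_bo->bb);	/* stand-in submit helper */
if (IS_ERR(fence)) {
	err = PTR_ERR(fence);
	goto free_ofence;	/* safe: fence not yet initialized or published */
}

/* No failure points from here on: init and arm the already-allocated fence */
dma_fence_init(&ofence->base, &xe_oa_fence_ops, &stream->oa_fence_lock, 0, 0);
```

This keeps all memory allocation out of the post-submission path, which is what the dma-fence signaling rules quoted below are about.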
> > +
> > +	dma_fence_init(&ofence->base, &xe_oa_fence_ops, &stream->oa_fence_lock, 0, 0);
> > +	return ofence;
> > +}
> > +
> >  static int xe_oa_emit_oa_config(struct xe_oa_stream *stream, struct xe_oa_config *config)
> >  {
> >  #define NOA_PROGRAM_ADDITIONAL_DELAY_US 500
> >  	struct xe_oa_config_bo *oa_bo;
> > -	int err = 0, us = NOA_PROGRAM_ADDITIONAL_DELAY_US;
> > +	struct xe_oa_fence *ofence;
> > +	int i, err, num_signal = 0;
> >  	struct dma_fence *fence;
> > -	long timeout;
> >
> >  	/* Emit OA configuration batch */
> >  	oa_bo = xe_oa_alloc_config_buffer(stream, config);
> > @@ -966,18 +1024,43 @@ static int xe_oa_emit_oa_config(struct xe_oa_stream *stream, struct xe_oa_config
> >  		goto exit;
> >  	}
> >
> > -	/* Wait till all previous batches have executed */
> > -	timeout = dma_fence_wait_timeout(fence, false, 5 * HZ);
> > -	dma_fence_put(fence);
> > -	if (timeout < 0)
> > -		err = timeout;
> > -	else if (!timeout)
> > -		err = -ETIME;
> > -	if (err)
> > -		drm_dbg(&stream->oa->xe->drm, "dma_fence_wait_timeout err %d\n", err);
> > +	/* Initialize and set fence to signal */
> > +	ofence = xe_oa_fence_arm(stream);
> > +	if (IS_ERR(ofence)) {
> > +		err = PTR_ERR(ofence);
> > +		goto put_fence;
> > +	}
> >
> > -	/* Additional empirical delay needed for NOA programming after registers are written */
> > -	usleep_range(us, 2 * us);
> > +	for (i = 0; i < stream->num_syncs; i++) {
> > +		if (stream->syncs[i].flags & DRM_XE_SYNC_FLAG_SIGNAL)
> > +			num_signal++;
> > +		xe_sync_entry_signal(&stream->syncs[i], &ofence->base);
> > +	}
> > +
> > +	/* Add job fence callback to schedule work to signal ofence->base */
> > +	err = dma_fence_add_callback(fence, &ofence->cb, xe_oa_config_cb);
> > +	if (err == -ENOENT)
> > +		xe_oa_config_cb(fence, &ofence->cb);
> > +	else if (err)
>
> I'd just assert here rather than fail. The only error dma_fence_add_callback
> currently returns is -ENOENT; in other code paths we just assert too. See
> invalidation_fence_init in xe_pt.c.

Done in v3.
> > +		goto put_ofence;
> > +
> > +	/* If nothing needs to be signaled we wait synchronously */
> > +	if (!num_signal)
> > +		dma_fence_wait(&ofence->base, true);
>
> I think you have a UAF here. The worker which signals the fence puts
> '&ofence->base'. So I think you need an extra ref for !num_signal, taken
> before calling dma_fence_add_callback and dropped after dma_fence_wait.

Good catch, but I think it is safe, because the fence cannot really be
freed till the xe_sync_entry_cleanup() below (even if userspace closes the
sync handle). But in any case, to be safe, I've added an additional
dma_fence_get/put.

> Also, since you have an interruptible wait here, you likely need to
> return an error to the user to retry the IOCTL upon interruption, right?

Good catch. I've changed that to a non-interruptible wait (I didn't really
need an interruptible wait, I was just jittery about hangs, i.e. the fence
not signaling at all :/).

Thanks a lot,
Ashutosh

> > +
> > +	/* Done with syncs */
> > +	for (i = 0; i < stream->num_syncs; i++)
> > +		xe_sync_entry_cleanup(&stream->syncs[i]);
> > +	kfree(stream->syncs);
> > +
> > +	return 0;
> > +put_ofence:
> > +	for (i = 0; i < stream->num_syncs; i++)
> > +		xe_sync_entry_cleanup(&stream->syncs[i]);
> > +	kfree(stream->syncs);
> > +	dma_fence_put(&ofence->base);
> > +put_fence:
> > +	dma_fence_put(fence);
> >  exit:
> >  	return err;
> >  }
> > @@ -1480,6 +1563,7 @@ static int xe_oa_stream_init(struct xe_oa_stream *stream,
> >  		goto err_free_oa_buf;
> >  	}
> >
> > +	spin_lock_init(&stream->oa_fence_lock);
> >  	ret = xe_oa_enable_metric_set(stream);
> >  	if (ret) {
> >  		drm_dbg(&stream->oa->xe->drm, "Unable to enable metric set\n");
> >
> > diff --git a/drivers/gpu/drm/xe/xe_oa_types.h b/drivers/gpu/drm/xe/xe_oa_types.h
> > index c1ca960af9305..412f1460c1437 100644
> > --- a/drivers/gpu/drm/xe/xe_oa_types.h
> > +++ b/drivers/gpu/drm/xe/xe_oa_types.h
> > @@ -239,6 +239,9 @@ struct xe_oa_stream {
> >  	/** @no_preempt: Whether preemption and timeslicing is disabled
> >  	 * for stream exec_q */
> >  	u32 no_preempt;
> >
> > +	/** @oa_fence_lock: Lock for struct xe_oa_fence */
> > +	spinlock_t oa_fence_lock;
> > +
> >  	/** @num_syncs: size of @syncs array */
> >  	u32 num_syncs;
> >
> > --
> > 2.41.0
> >