Date: Mon, 19 Aug 2024 18:01:11 -0700
Message-ID: <87o75odn3c.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: "Cavitt, Jonathan"
Cc: "intel-xe@lists.freedesktop.org" ,
	"Nerlige Ramappa, Umesh" ,
	"Souza, Jose" ,
	"Landwerlin, Lionel G"
Subject: Re: [PATCH 1/8] drm/xe/oa: Separate batch submission from waiting for completion
In-Reply-To: 
References: <20240808174139.4027534-1-ashutosh.dixit@intel.com>
	<20240808174139.4027534-2-ashutosh.dixit@intel.com>

On Thu, 08 Aug 2024 13:18:21 -0700, Cavitt, Jonathan wrote:
>

Hi Jonathan,

> -----Original Message-----
> From: Intel-xe On Behalf Of Ashutosh Dixit
> Sent: Thursday, August 8, 2024 10:42 AM
> To: intel-xe@lists.freedesktop.org
> Cc: Nerlige Ramappa, Umesh ; Souza, Jose ; Landwerlin, Lionel G
> Subject: [PATCH 1/8] drm/xe/oa: Separate batch submission from waiting for completion
> >
> > When we introduce xe_syncs, we don't wait for internal OA programming
> > batches to complete. That is, xe_syncs are signaled asynchronously. In
> > anticipation for this, separate out batch submission from waiting for
> > completion of those batches.
> >
> > Signed-off-by: Ashutosh Dixit
>
> The added goto err in xe_oa_emit_oa_config had me briefly
> worried due to the fence variable not getting freed, but it
> seems the new organization of the series never sets the
> fence variable in the case of an error.

The patch still had a bug: fence should be initialized to NULL so that
xe_bb_free() and dma_fence_put() work correctly if xe_oa_submit_bb()
returns an error. Hopefully this is fixed in v2, where the patch is
reworked along the lines suggested by Matt.

> So, LGTM.
> Reviewed-by: Jonathan Cavitt

Thanks.
--
Ashutosh

> -Jonathan Cavitt

> > ---
> >  drivers/gpu/drm/xe/xe_oa.c | 45 ++++++++++++++++++++++++--------------
> >  1 file changed, 28 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c
> > index 3ef92eb8fbb1e..d842c801fb9f1 100644
> > --- a/drivers/gpu/drm/xe/xe_oa.c
> > +++ b/drivers/gpu/drm/xe/xe_oa.c
> > @@ -563,11 +563,10 @@ static __poll_t xe_oa_poll(struct file *file, poll_table *wait)
> >  	return ret;
> >  }
> >
> > -static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb)
> > +static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb,
> > +			   struct dma_fence **fence)
> >  {
> >  	struct xe_sched_job *job;
> > -	struct dma_fence *fence;
> > -	long timeout;
> >  	int err = 0;
> >
> >  	/* Kernel configuration is issued on stream->k_exec_q, not stream->exec_q */
> > @@ -578,15 +577,8 @@ static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb)
> >  	}
> >
> >  	xe_sched_job_arm(job);
> > -	fence = dma_fence_get(&job->drm.s_fence->finished);
> > +	*fence = dma_fence_get(&job->drm.s_fence->finished);
> >  	xe_sched_job_push(job);
> > -
> > -	timeout = dma_fence_wait_timeout(fence, false, HZ);
> > -	dma_fence_put(fence);
> > -	if (timeout < 0)
> > -		err = timeout;
> > -	else if (!timeout)
> > -		err = -ETIME;
> >  exit:
> >  	return err;
> >  }
> > @@ -652,6 +644,7 @@ static void xe_oa_store_flex(struct xe_oa_stream *stream, struct xe_lrc *lrc,
> >
> >  static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lrc,
> >  				  const struct flex *flex, u32 count)
> >  {
> > +	struct dma_fence *fence;
> >  	struct xe_bb *bb;
> >  	int err;
> >
> > @@ -663,14 +656,16 @@ static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lr
> >
> >  	xe_oa_store_flex(stream, lrc, bb, flex, count);
> >
> > -	err = xe_oa_submit_bb(stream, bb);
> > -	xe_bb_free(bb, NULL);
> > +	err = xe_oa_submit_bb(stream, bb, &fence);
> > +	xe_bb_free(bb, fence);
> > +	dma_fence_put(fence);
> >  exit:
> >  	return err;
> >  }
> >
> >  static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri)
> >  {
> > +	struct dma_fence *fence;
> >  	struct xe_bb *bb;
> >  	int err;
> >
> > @@ -682,8 +677,9 @@ static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *re
> >
> >  	write_cs_mi_lri(bb, reg_lri, 1);
> >
> > -	err = xe_oa_submit_bb(stream, bb);
> > -	xe_bb_free(bb, NULL);
> > +	err = xe_oa_submit_bb(stream, bb, &fence);
> > +	xe_bb_free(bb, fence);
> > +	dma_fence_put(fence);
> >  exit:
> >  	return err;
> >  }
> > @@ -913,15 +909,30 @@ static int xe_oa_emit_oa_config(struct xe_oa_stream *stream, struct xe_oa_config
> >  {
> >  #define NOA_PROGRAM_ADDITIONAL_DELAY_US 500
> >  	struct xe_oa_config_bo *oa_bo;
> > -	int err, us = NOA_PROGRAM_ADDITIONAL_DELAY_US;
> > +	int err = 0, us = NOA_PROGRAM_ADDITIONAL_DELAY_US;
> > +	struct dma_fence *fence;
> > +	long timeout;
> >
> > +	/* Emit OA configuration batch */
> >  	oa_bo = xe_oa_alloc_config_buffer(stream, config);
> >  	if (IS_ERR(oa_bo)) {
> >  		err = PTR_ERR(oa_bo);
> >  		goto exit;
> >  	}
> >
> > -	err = xe_oa_submit_bb(stream, oa_bo->bb);
> > +	err = xe_oa_submit_bb(stream, oa_bo->bb, &fence);
> > +	if (err)
> > +		goto exit;
> > +
> > +	/* Wait till all previous batches have executed */
> > +	timeout = dma_fence_wait_timeout(fence, false, 5 * HZ);
> > +	dma_fence_put(fence);
> > +	if (timeout < 0)
> > +		err = timeout;
> > +	else if (!timeout)
> > +		err = -ETIME;
> > +	if (err)
> > +		drm_dbg(&stream->oa->xe->drm, "dma_fence_wait_timeout err %d\n", err);
> >
> >  	/* Additional empirical delay needed for NOA programming after registers are written */
> >  	usleep_range(us, 2 * us);
> > --
> > 2.41.0
> >