From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 19 Aug 2024 18:00:39 -0700
Message-ID: <87ikvw80ug.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Matthew Brost
Cc: intel-xe@lists.freedesktop.org
Subject: Re: [PATCH 1/8] drm/xe/oa: Separate batch submission from waiting for completion
References: <20240808174139.4027534-1-ashutosh.dixit@intel.com>
 <20240808174139.4027534-2-ashutosh.dixit@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
List-Id: Intel Xe graphics driver

On Thu, 08 Aug 2024 16:04:10 -0700, Matthew Brost wrote:
>
> On Thu, Aug 08, 2024 at 10:41:32AM -0700, Ashutosh Dixit wrote:
> > When we introduce xe_syncs, we don't wait for internal OA programming
> > batches to complete. That is, xe_syncs are signaled asynchronously. In
> > anticipation for this, separate out batch submission from waiting for
> > completion of those batches.
> >
> > Signed-off-by: Ashutosh Dixit
> > ---
> >  drivers/gpu/drm/xe/xe_oa.c | 45 ++++++++++++++++++++++++--------------
> >  1 file changed, 28 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c
> > index 3ef92eb8fbb1e..d842c801fb9f1 100644
> > --- a/drivers/gpu/drm/xe/xe_oa.c
> > +++ b/drivers/gpu/drm/xe/xe_oa.c
> > @@ -563,11 +563,10 @@ static __poll_t xe_oa_poll(struct file *file, poll_table *wait)
> >  	return ret;
> >  }
> >
> > -static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb)
> > +static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb,
> > +			   struct dma_fence **fence)
>
> static struct dma_fence *xe_oa_submit_bb(...)
>
> Then use ERR_PTR, PTR_ERR semantics.

Done in v2.

>
> Matt
>
> >  {
> >  	struct xe_sched_job *job;
> > -	struct dma_fence *fence;
> > -	long timeout;
> >  	int err = 0;
> >
> >  	/* Kernel configuration is issued on stream->k_exec_q, not stream->exec_q */
> > @@ -578,15 +577,8 @@ static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb)
> >  	}
> >
> >  	xe_sched_job_arm(job);
> > -	fence = dma_fence_get(&job->drm.s_fence->finished);
> > +	*fence = dma_fence_get(&job->drm.s_fence->finished);
> >  	xe_sched_job_push(job);
> > -
> > -	timeout = dma_fence_wait_timeout(fence, false, HZ);
> > -	dma_fence_put(fence);
> > -	if (timeout < 0)
> > -		err = timeout;
> > -	else if (!timeout)
> > -		err = -ETIME;
> >  exit:
> >  	return err;
> >  }
> > @@ -652,6 +644,7 @@ static void xe_oa_store_flex(struct xe_oa_stream *stream, struct xe_lrc *lrc,
> >  static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lrc,
> >  				  const struct flex *flex, u32 count)
> >  {
> > +	struct dma_fence *fence;
> >  	struct xe_bb *bb;
> >  	int err;
> >
> > @@ -663,14 +656,16 @@ static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lr
> >
> >  	xe_oa_store_flex(stream, lrc, bb, flex, count);
> >
> > -	err = xe_oa_submit_bb(stream, bb);
> > -	xe_bb_free(bb, NULL);
> > +	err = xe_oa_submit_bb(stream, bb, &fence);
> > +	xe_bb_free(bb, fence);
> > +	dma_fence_put(fence);
> >  exit:
> >  	return err;
> >  }
> >
> >  static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri)
> >  {
> > +	struct dma_fence *fence;
> >  	struct xe_bb *bb;
> >  	int err;
> >
> > @@ -682,8 +677,9 @@ static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *re
> >
> >  	write_cs_mi_lri(bb, reg_lri, 1);
> >
> > -	err = xe_oa_submit_bb(stream, bb);
> > -	xe_bb_free(bb, NULL);
> > +	err = xe_oa_submit_bb(stream, bb, &fence);
> > +	xe_bb_free(bb, fence);
> > +	dma_fence_put(fence);
> >  exit:
> >  	return err;
> >  }
> > @@ -913,15 +909,30 @@ static int xe_oa_emit_oa_config(struct xe_oa_stream *stream, struct xe_oa_config
> >  {
> >  #define NOA_PROGRAM_ADDITIONAL_DELAY_US 500
> >  	struct xe_oa_config_bo *oa_bo;
> > -	int err, us = NOA_PROGRAM_ADDITIONAL_DELAY_US;
> > +	int err = 0, us = NOA_PROGRAM_ADDITIONAL_DELAY_US;
> > +	struct dma_fence *fence;
> > +	long timeout;
> >
> > +	/* Emit OA configuration batch */
> >  	oa_bo = xe_oa_alloc_config_buffer(stream, config);
> >  	if (IS_ERR(oa_bo)) {
> >  		err = PTR_ERR(oa_bo);
> >  		goto exit;
> >  	}
> >
> > -	err = xe_oa_submit_bb(stream, oa_bo->bb);
> > +	err = xe_oa_submit_bb(stream, oa_bo->bb, &fence);
> > +	if (err)
> > +		goto exit;
> > +
> > +	/* Wait till all previous batches have executed */
> > +	timeout = dma_fence_wait_timeout(fence, false, 5 * HZ);
> > +	dma_fence_put(fence);
> > +	if (timeout < 0)
> > +		err = timeout;
> > +	else if (!timeout)
> > +		err = -ETIME;
> > +	if (err)
> > +		drm_dbg(&stream->oa->xe->drm, "dma_fence_wait_timeout err %d\n", err);
> >
> >  	/* Additional empirical delay needed for NOA programming after registers are written */
> >  	usleep_range(us, 2 * us);
> > --
> > 2.41.0
> >
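
For anyone following the thread, here is a rough userspace sketch of the shape the ERR_PTR/PTR_ERR suggestion above points at: the submit helper returns the fence directly (or an encoded errno), and only the caller that needs completion waits and maps the wait result to an error. It is not the v2 code. `fake_fence`, `sketch_submit_bb()`, `sketch_emit_config()` and the simplified `ERR_PTR`/`PTR_ERR`/`IS_ERR` helpers are hypothetical stand-ins so the example builds outside the kernel, and the `wait_result` field merely imitates what dma_fence_wait_timeout() would return.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's err.h helpers (assumption). */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical toy fence, NOT struct dma_fence. */
struct fake_fence {
	int refcount;
	long wait_result;	/* <0: error, 0: timed out, >0: signaled in time */
};

static void fake_fence_put(struct fake_fence *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/*
 * Hypothetical submit helper: on success it takes a reference and returns
 * the fence; on failure it returns ERR_PTR(-errno) and takes no reference.
 * It never waits.
 */
static struct fake_fence *sketch_submit_bb(struct fake_fence *job_fence,
					   int create_err)
{
	if (create_err)
		return ERR_PTR(create_err);

	job_fence->refcount++;
	return job_fence;
}

/* Hypothetical caller that does need completion before continuing. */
int sketch_emit_config(struct fake_fence *job_fence, int create_err)
{
	struct fake_fence *fence;
	long timeout;

	fence = sketch_submit_bb(job_fence, create_err);
	if (IS_ERR(fence))
		return (int)PTR_ERR(fence);

	/* Stands in for a bounded fence wait. */
	timeout = fence->wait_result;
	fake_fence_put(fence);

	if (timeout < 0)
		return (int)timeout;	/* wait itself failed */
	if (!timeout)
		return -ETIME;		/* wait expired */
	return 0;
}
```

With this shape a caller cannot reach the fence on the error path without first going through IS_ERR(), and callers that do not need to wait can drop their reference right after submission.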