From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 938C5C52D6F for ; Wed, 21 Aug 2024 15:28:42 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 60CED10E32A; Wed, 21 Aug 2024 15:28:42 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.b="IPQW0OJ/"; dkim-atps=neutral Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id 327A010E31F for ; Wed, 21 Aug 2024 15:28:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1724254113; x=1755790113; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=IGhe9a9l7nJmpZhQuO3tZynA/UgFaV1J4n6H+84+tnk=; b=IPQW0OJ/ixTH41XMsFPCh6aQ6J9Z8YUEjVvxC51JjULYWfkBxxNhN9d8 iIC0s4G6tRLMGRD+CWKeaJjdo9X3uAbic2z2us+Vnq2CuYmMg6525jshK p/p9LD1uUi5nCA3gIOVbPCa91043xV816YTnmrS7pV/vUMoHfNrmZ/Xo6 80NdHLbTm9l44DrfvQYbmSlWiG8CBHrddHtxwd6E7BuAI7SVzwNU16aJS CEVbZcLRkOTgQdjCXxw3WnupPBXvQSq00xh2vo3DTXSGl5kQSlEfeJgWD lNqfDVexQ3eOd2MShNjjozfBRrjSRDzW6QuYiE9XXYIdq38n1EBH2ZKb3 w==; X-CSE-ConnectionGUID: 7SrzDoH9TAa2EccAsS2QdA== X-CSE-MsgGUID: LnxMV2UnT4SQcurA3iOnnA== X-IronPort-AV: E=McAfee;i="6700,10204,11171"; a="22427397" X-IronPort-AV: E=Sophos;i="6.10,164,1719903600"; d="scan'208";a="22427397" Received: from orviesa008.jf.intel.com ([10.64.159.148]) by orvoesa112.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Aug 2024 08:28:32 -0700 X-CSE-ConnectionGUID: Pr2bWif6QTi6dKeuWEl91g== X-CSE-MsgGUID: RsphQoxsSlOxqaqd+eXPxA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,164,1719903600"; d="scan'208";a="61877112" Received: from orsosgc001.jf.intel.com ([10.165.21.138]) by orviesa008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Aug 2024 08:28:34 -0700 From: Ashutosh Dixit To: intel-xe@lists.freedesktop.org Cc: Matthew Brost , Jose Souza , Lionel Landwerlin , Umesh Nerlige Ramappa , Jonathan Cavitt Subject: [PATCH 1/7] drm/xe/oa: Separate batch submission from waiting for completion Date: Wed, 21 Aug 2024 08:28:24 -0700 Message-ID: <20240821152830.1495276-2-ashutosh.dixit@intel.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20240821152830.1495276-1-ashutosh.dixit@intel.com> References: <20240821152830.1495276-1-ashutosh.dixit@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-BeenThere: intel-xe@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Xe graphics driver List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-xe-bounces@lists.freedesktop.org Sender: "Intel-xe" When we introduce xe_syncs, we don't wait for internal OA programming batches to complete. That is, xe_syncs are signaled asynchronously. In anticipation for this, separate out batch submission from waiting for completion of those batches. v2: Change return type of xe_oa_submit_bb to "struct dma_fence *" (Matt B) Reviewed-by: Jonathan Cavitt Signed-off-by: Ashutosh Dixit --- drivers/gpu/drm/xe/xe_oa.c | 59 +++++++++++++++++++++++++++++--------- 1 file changed, 45 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c index 3ef92eb8fbb1e..08a1c8ce446c1 100644 --- a/drivers/gpu/drm/xe/xe_oa.c +++ b/drivers/gpu/drm/xe/xe_oa.c @@ -563,12 +563,11 @@ static __poll_t xe_oa_poll(struct file *file, poll_table *wait) return ret; } -static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb) +static struct dma_fence *xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb) { struct xe_sched_job *job; struct dma_fence *fence; - long timeout; - int err = 0; + int err; /* Kernel configuration is issued on stream->k_exec_q, not stream->exec_q */ job = xe_bb_create_job(stream->k_exec_q, bb); @@ -581,14 +580,9 @@ static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb) fence = dma_fence_get(&job->drm.s_fence->finished); xe_sched_job_push(job); - timeout = dma_fence_wait_timeout(fence, false, HZ); - dma_fence_put(fence); - if (timeout < 0) - err = timeout; - else if (!timeout) - err = -ETIME; + return fence; exit: - return err; + return ERR_PTR(err); } static void write_cs_mi_lri(struct xe_bb *bb, const struct xe_oa_reg *reg_data, u32 n_regs) @@ -652,6 +646,7 @@ static void xe_oa_store_flex(struct xe_oa_stream *stream, struct xe_lrc *lrc, static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lrc, const struct flex *flex, u32 count) { + struct dma_fence *fence; struct xe_bb *bb; int err; @@ -663,7 +658,16 @@ static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lr xe_oa_store_flex(stream, lrc, bb, flex, count); - err = xe_oa_submit_bb(stream, bb); + fence = xe_oa_submit_bb(stream, bb); + if (IS_ERR(fence)) { + err = PTR_ERR(fence); + goto free_bb; + } + xe_bb_free(bb, fence); + dma_fence_put(fence); + + return 0; +free_bb: xe_bb_free(bb, NULL); exit: return err; @@ -671,6 +675,7 @@ static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lr static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri) { + struct dma_fence *fence; struct xe_bb *bb; int err; @@ -682,7 +687,16 @@ static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *re write_cs_mi_lri(bb, reg_lri, 1); - err = xe_oa_submit_bb(stream, bb); + fence = xe_oa_submit_bb(stream, bb); + if (IS_ERR(fence)) { + err = PTR_ERR(fence); + goto free_bb; + } + xe_bb_free(bb, fence); + dma_fence_put(fence); + + return 0; +free_bb: xe_bb_free(bb, NULL); exit: return err; @@ -913,15 +927,32 @@ static int xe_oa_emit_oa_config(struct xe_oa_stream *stream, struct xe_oa_config { #define NOA_PROGRAM_ADDITIONAL_DELAY_US 500 struct xe_oa_config_bo *oa_bo; - int err, us = NOA_PROGRAM_ADDITIONAL_DELAY_US; + int err = 0, us = NOA_PROGRAM_ADDITIONAL_DELAY_US; + struct dma_fence *fence; + long timeout; + /* Emit OA configuration batch */ oa_bo = xe_oa_alloc_config_buffer(stream, config); if (IS_ERR(oa_bo)) { err = PTR_ERR(oa_bo); goto exit; } - err = xe_oa_submit_bb(stream, oa_bo->bb); + fence = xe_oa_submit_bb(stream, oa_bo->bb); + if (IS_ERR(fence)) { + err = PTR_ERR(fence); + goto exit; + } + + /* Wait till all previous batches have executed */ + timeout = dma_fence_wait_timeout(fence, false, 5 * HZ); + dma_fence_put(fence); + if (timeout < 0) + err = timeout; + else if (!timeout) + err = -ETIME; + if (err) + drm_dbg(&stream->oa->xe->drm, "dma_fence_wait_timeout err %d\n", err); /* Additional empirical delay needed for NOA programming after registers are written */ usleep_range(us, 2 * us); -- 2.41.0