From: Matthew Brost <matthew.brost@intel.com>
To: intel-xe@lists.freedesktop.org
Date: Thu, 14 Sep 2023 13:40:53 -0700
Message-Id: <20230914204053.2220281-7-matthew.brost@intel.com>
In-Reply-To: <20230914204053.2220281-1-matthew.brost@intel.com>
References: <20230914204053.2220281-1-matthew.brost@intel.com>
Subject: [Intel-xe] [PATCH 6/6] drm/xe: Allow num_batch_buffer == 0 in exec IOCTL

The idea is that out-syncs can signal once all previous operations on the
exec queue are complete. An example use case for this is easily
implementing vkQueueWaitForIdle.

v2: Don't add last_fence for VMs that do not support dma fences

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/xe/xe_exec.c             | 22 +++++++++++++++++++---
 drivers/gpu/drm/xe/xe_exec_queue.c       |  5 ++++-
 drivers/gpu/drm/xe/xe_exec_queue_types.h |  5 +++--
 drivers/gpu/drm/xe/xe_sync.c             |  5 ++++-
 drivers/gpu/drm/xe/xe_sync.h             |  2 +-
 drivers/gpu/drm/xe/xe_vm.c               |  2 +-
 6 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_exec.c b/drivers/gpu/drm/xe/xe_exec.c
index 28e84a0bbeb0..4666f5b145f7 100644
--- a/drivers/gpu/drm/xe/xe_exec.c
+++ b/drivers/gpu/drm/xe/xe_exec.c
@@ -161,7 +161,8 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	if (XE_IOCTL_DBG(xe, q->flags & EXEC_QUEUE_FLAG_VM))
 		return -EINVAL;
 
-	if (XE_IOCTL_DBG(xe, q->width != args->num_batch_buffer))
+	if (XE_IOCTL_DBG(xe, args->num_batch_buffer &&
+			 q->width != args->num_batch_buffer))
 		return -EINVAL;
 
 	if (XE_IOCTL_DBG(xe, q->flags & EXEC_QUEUE_FLAG_BANNED)) {
@@ -182,12 +183,13 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	for (i = 0; i < args->num_syncs; i++) {
 		err = xe_sync_entry_parse(xe, xef, &syncs[num_syncs++],
 					  &syncs_user[i], true,
-					  xe_vm_no_dma_fences(vm));
+					  xe_vm_no_dma_fences(vm),
+					  !args->num_batch_buffer);
 		if (err)
 			goto err_syncs;
 	}
 
-	if (xe_exec_queue_is_parallel(q)) {
+	if (args->num_batch_buffer && xe_exec_queue_is_parallel(q)) {
 		err = __copy_from_user(addresses, addresses_user,
 				       sizeof(u64) * q->width);
 		if (err) {
@@ -234,6 +236,18 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 		goto err_exec;
 	}
 
+	if (!args->num_batch_buffer) {
+		if (!xe_vm_no_dma_fences(vm)) {
+			struct dma_fence *fence =
+				xe_exec_queue_last_fence_get(q, vm);
+
+			for (i = 0; i < num_syncs; i++)
+				xe_sync_entry_signal(&syncs[i], NULL, fence);
+		}
+
+		goto err_exec;
+	}
+
 	if (xe_exec_queue_is_lr(q) && xe_exec_queue_ring_full(q)) {
 		err = -EWOULDBLOCK;
 		goto err_exec;
@@ -327,6 +341,8 @@ int xe_exec_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	if (xe_exec_queue_is_lr(q))
 		q->ring_ops->emit_job(job);
 
+	if (!xe_vm_no_dma_fences(vm))
+		xe_exec_queue_last_fence_set(q, vm, &job->drm.s_fence->finished);
 	xe_sched_job_push(job);
 
 	xe_vm_reactivate_rebind(vm);
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index 8722ab6ba00a..9fe91f66f776 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -964,7 +964,10 @@ int xe_exec_queue_set_property_ioctl(struct drm_device *dev, void *data,
 static void xe_exec_queue_last_fence_lockdep_assert(struct xe_exec_queue *q,
 						    struct xe_vm *vm)
 {
-	lockdep_assert_held_write(&vm->lock);
+	if (q->flags & EXEC_QUEUE_FLAG_VM)
+		lockdep_assert_held_write(&vm->lock);
+	else
+		xe_vm_assert_held(vm);
 }
 
 /**
diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h
index 71ed8d22a8a1..9648b2bbabc9 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue_types.h
+++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h
@@ -53,8 +53,9 @@ struct xe_exec_queue {
 	struct xe_hw_fence_irq *fence_irq;
 
 	/**
-	 * @last_fence: last fence on engine, protected by vm->lock in write
-	 * mode if bind engine
+	 * @last_fence: last fence on exec queue, protected by vm->lock in write
+	 * mode if bind exec queue, protected by dma resv lock if non-bind exec
+	 * queue
 	 */
 	struct dma_fence *last_fence;
 
diff --git a/drivers/gpu/drm/xe/xe_sync.c b/drivers/gpu/drm/xe/xe_sync.c
index 73ef259aa387..2461e7d4814c 100644
--- a/drivers/gpu/drm/xe/xe_sync.c
+++ b/drivers/gpu/drm/xe/xe_sync.c
@@ -100,7 +100,7 @@ static void user_fence_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
 int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef,
 			struct xe_sync_entry *sync,
 			struct drm_xe_sync __user *sync_user,
-			bool exec, bool no_dma_fences)
+			bool exec, bool no_dma_fences, bool exec_nop)
 {
 	struct drm_xe_sync sync_in;
 	int err;
@@ -171,6 +171,9 @@ int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef,
 		break;
 
 	case DRM_XE_SYNC_USER_FENCE:
+		if (XE_IOCTL_DBG(xe, exec_nop))
+			return -EOPNOTSUPP;
+
 		if (XE_IOCTL_DBG(xe, !signal))
 			return -EOPNOTSUPP;
 
diff --git a/drivers/gpu/drm/xe/xe_sync.h b/drivers/gpu/drm/xe/xe_sync.h
index 30958ddc4cdc..98f02bb34637 100644
--- a/drivers/gpu/drm/xe/xe_sync.h
+++ b/drivers/gpu/drm/xe/xe_sync.h
@@ -15,7 +15,7 @@ struct xe_sched_job;
 int xe_sync_entry_parse(struct xe_device *xe, struct xe_file *xef,
 			struct xe_sync_entry *sync,
 			struct drm_xe_sync __user *sync_user,
-			bool exec, bool compute_mode);
+			bool exec, bool compute_mode, bool exec_nop);
 int xe_sync_entry_wait(struct xe_sync_entry *sync);
 int xe_sync_entry_add_deps(struct xe_sync_entry *sync,
 			   struct xe_sched_job *job);
diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
index 0e2f3ab453ea..c2526950cf60 100644
--- a/drivers/gpu/drm/xe/xe_vm.c
+++ b/drivers/gpu/drm/xe/xe_vm.c
@@ -2916,7 +2916,7 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	for (num_syncs = 0; num_syncs < args->num_syncs; num_syncs++) {
 		err = xe_sync_entry_parse(xe, xef,
 					  &syncs[num_syncs],
 					  &syncs_user[num_syncs], false,
-					  xe_vm_no_dma_fences(vm));
+					  xe_vm_no_dma_fences(vm), false);
 		if (err)
 			goto free_syncs;
 	}
-- 
2.34.1