Date: Wed, 20 Nov 2024 20:22:05 -0800
Message-ID: <85y11djjwy.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Matthew Brost
Cc: intel-xe@lists.freedesktop.org, Umesh Nerlige Ramappa, Jonathan Cavitt
Subject: Re: [PATCH] drm/xe/oa: Disallow OA from being enabled on active exec_queue's
References: <20241119013256.680030-1-ashutosh.dixit@intel.com> <851pz76ie6.wl-ashutosh.dixit@intel.com>

On Tue, 19 Nov 2024 14:09:07 -0800, Matthew Brost wrote:
>

Hi Matt,

> On Tue, Nov 19, 2024 at 01:08:49PM -0800, Dixit, Ashutosh wrote:
> > On Tue, 19 Nov 2024 06:44:51 -0800, Matthew Brost wrote:
> > >
> > > On Mon, Nov 18, 2024 at 05:32:56PM -0800, Ashutosh Dixit wrote:
> > > > Enabling OA on an exec_queue toggles the OAC_CONTEXT_ENABLE bit in
> > > > CTXT_SR_CTL register. Toggling this bit changes the size and layout of the
> > > > underlying HW context image. Therefore, enabling OA on an already active
> > > > exec_queue (as currently implemented in xe) is an invalid operation and can
> > > > cause hangs.
> > > > Therefore, disallow OA from being enabled on active
> > > > exec_queue's (here, by active we mean a context on which submissions have
> > > > previously happened).
> > > >
> > >
> > > This is something we will need to keep an eye on then because in various
> > > experimental code I've played around enabling exec queues upon creation.
> > > e.g., If we want to allocate a doorbell. I seem to recall Habana wanting
> > > to enable exec queues upon creation too.
> >
> > The real requirement here is that HW context image should not have been
> > loaded before OA is enabled on the exec queue. That is what happens today
> > in the ENABLED state, correct, when user space submissions start?
> >
>
> Yea, that is the current flow. If we change this to enable an exec queue
> upon allocation we probably could check for ring head & tail == 0.
>
> Also note LR queues with preemption fences do toggle the enabled state
> too. Does OA apply to those queues? If so, a registered check may be
> better than enabled or maybe just start with ring head & tail == 0.

Yes, I could do 'ring head & tail == 0', but I am not sure how that would
work, since the ring head and tail could have wrapped around back to 0
(giving the false impression that the exec_queue was not active when it
was). I am assuming you mean the ring is "0 to N" rather than "gtt_addr to
gtt_addr + N", correct?

Thanks.
--
Ashutosh

> > If operations such as doorbell are only management requests to GuC (which
> > don't cause HW context image to be loaded) and if we can name a new state
> > when the exec queue is handed off to userspace for starting submissions, we
> > should be able to stay with this approach.
> >
> > > Just curious if it was ever explored having exec queue creation
> > > extension which enables OA? It seems like this is something we may need
> > > at some point if our exec queue creation semantics change of course
> > > being careful to not break existing flows.
> >
> > Yeah I did think of it but didn't want to change the uapi.
> >
>
> Makes sense.
>
> > Also, a different implementation is possible which avoids this resizing of
> > the context image altogether. It requires the kernel OA code submit its
> > submissions on the user exec queue (and use that exec queue's VM, currently
> > OA code uses a kernel exec queue). There are some reasons I don't want to
> > implement that just yet, but worst case, we can do that if absolutely
> > needed.
> >
> > Thanks.
> > --
> > Ashutosh
> >
> > > > Transition from 1 -> 0 for this bit was disallowed in
> > > > '0c8650b09a36 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream
> > > > close")'. Here we disallow the 0 -> 1 transition on active contexts.
> > > >
> > > > v2: Don't export exec_queue_enabled, define new xe_exec_queue_op (M Brost)
> > > >     Directly check OAC_CONTEXT_ENABLE bit from context image (J Cavitt)
> > > >
> > > > Bspec: 60314
> > > > Fixes: 2f4a730fcd2d ("drm/xe/oa: Add OAR support")
> > > > Cc: stable@vger.kernel.org
> > > > Signed-off-by: Ashutosh Dixit
> > > > ---
> > > >  drivers/gpu/drm/xe/xe_exec_queue_types.h |  2 ++
> > > >  drivers/gpu/drm/xe/xe_guc_submit.c       |  1 +
> > > >  drivers/gpu/drm/xe/xe_oa.c               | 13 +++++++++++++
> > > >  3 files changed, 16 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h b/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > > index 1158b6062a6cd..b88d617c37b33 100644
> > > > --- a/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > > +++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > > @@ -184,6 +184,8 @@ struct xe_exec_queue_ops {
> > > >  	void (*resume)(struct xe_exec_queue *q);
> > > >  	/** @reset_status: check exec queue reset status */
> > > >  	bool (*reset_status)(struct xe_exec_queue *q);
> > > > +	/** @enabled: check if exec queue is in enabled state */
> > > > +	bool (*enabled)(struct xe_exec_queue *q);
> > > >  };
> > > >
> > > >  #endif
> > > > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > index f9ecee5364d82..b9b9cdb6f768b 100644
> > > > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > @@ -1660,6 +1660,7 @@ static const struct xe_exec_queue_ops guc_exec_queue_ops = {
> > > >  	.suspend_wait = guc_exec_queue_suspend_wait,
> > > >  	.resume = guc_exec_queue_resume,
> > > >  	.reset_status = guc_exec_queue_reset_status,
> > > > +	.enabled = exec_queue_enabled,
> > > >  };
> > > >
> > > >  static void guc_exec_queue_stop(struct xe_guc *guc, struct xe_exec_queue *q)
> > > > diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c
> > > > index 8dd55798ab312..4a7440c40978c 100644
> > > > --- a/drivers/gpu/drm/xe/xe_oa.c
> > > > +++ b/drivers/gpu/drm/xe/xe_oa.c
> > > > @@ -2066,6 +2066,19 @@ int xe_oa_stream_open_ioctl(struct drm_device *dev, u64 data, struct drm_file *f
> > > >  		if (XE_IOCTL_DBG(oa->xe, !param.exec_q))
> > > >  			return -ENOENT;
> > > >
> > > > +		/*
> > > > +		 * Disallow OA from being enabled on active exec_queue's. Enabling OA sets the
> > > > +		 * OAC_CONTEXT_ENABLE bit in CTXT_SR_CTL register. Toggling the bit changes
> > > > +		 * the size and layout of the underlying HW context image and can cause hangs.
> > > > +		 */
> > > > +		if (XE_IOCTL_DBG(oa->xe,
> > > > +				 !(xe_lrc_read_ctx_reg(param.exec_q->lrc[0],
> > > > +						       CTX_CONTEXT_CONTROL) & CTX_CTRL_OAC_CONTEXT_ENABLE) &&
> > > > +				 param.exec_q->ops->enabled(param.exec_q))) {
> > > > +			ret = -EADDRINUSE;
> > > > +			goto err_exec_q;
> > > > +		}
> > > > +
> > > >  		if (param.exec_q->width > 1)
> > > >  			drm_dbg(&oa->xe->drm, "exec_q->width > 1, programming only exec_q->lrc[0]\n");
> > > >  	}
> > > > --
> > > > 2.41.0
> > > >
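
The wraparound concern in the reply above (ring head and tail returning to
0 on a queue that has already run) can be illustrated with a small
standalone model. This is only a sketch under stated assumptions: struct
toy_ring and its helpers are hypothetical names invented for the
illustration, the ring is assumed to use "0 to N" offsets with a
power-of-two size, and none of this is xe driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of a ring; not the xe driver's data structures. */
struct toy_ring {
	uint32_t size;        /* ring size in bytes, assumed power of two */
	uint32_t head;        /* consumer offset, always in [0, size) */
	uint32_t tail;        /* producer offset, always in [0, size) */
	bool ever_submitted;  /* sticky flag, set on first submission */
};

static void toy_ring_init(struct toy_ring *r, uint32_t size)
{
	/* The masking below only works for power-of-two sizes. */
	assert(size != 0 && (size & (size - 1)) == 0);
	r->size = size;
	r->head = 0;
	r->tail = 0;
	r->ever_submitted = false;
}

/* Write len bytes at the tail and let the consumer retire them at once. */
static void toy_ring_submit_and_retire(struct toy_ring *r, uint32_t len)
{
	r->tail = (r->tail + len) & (r->size - 1);
	r->head = r->tail;
	r->ever_submitted = true;
}

/* The "ring head & tail == 0" style check discussed in the thread. */
static bool toy_ring_looks_unused(const struct toy_ring *r)
{
	return r->head == 0 && r->tail == 0;
}
```

With a 4096-byte ring, submitting 1024 bytes and then 3072 bytes brings
both offsets back to 0, so toy_ring_looks_unused() reports true even though
ever_submitted is set. In this model, only a sticky indication that never
resets distinguishes a queue that has run from one that has not.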