From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CBEB0C6FA82 for ; Fri, 9 Sep 2022 23:48:26 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9A9FE10E1C2; Fri, 9 Sep 2022 23:48:25 +0000 (UTC) Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7A10C10E1C2 for ; Fri, 9 Sep 2022 23:48:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1662767301; x=1694303301; h=date:message-id:from:to:cc:subject:in-reply-to: references:mime-version; bh=Cer81Gv8Kc9QRT9auaFjJWhiUj4u5Ednc7ZwgQNyYy0=; b=g1PXgkAAfbqyZnyQfZfOxysZVXSX5xQcgAcdX+ZG/C+lPE1D06PoMGtH 0DaubuJJToC2IgiqEONK7D9OAWhRgDro3ZufGC7MEmQHJ9YmqjqNDfaZA E/isaFcLk/kb4fX/hEeVhCaZ1iYSxzA2Hs3oxlBmshwJGFrA34GZd0czQ 3TD6RH4K+MCTcP/s1Fv9xUCJGFpuKuBLjsfMEVWxplnTBtItol8hNEJol x7vLpczm6kh8YdA0PCnspVVtnas1f2KniLMRp44VgrKFOcU0xgaVRkfD3 8yqOMP8yYcX4Pkk60ovXtXyZM72OmjdK6B7JnQ/FdkENkJS4Z3PADrGWo A==; X-IronPort-AV: E=McAfee;i="6500,9779,10465"; a="295175088" X-IronPort-AV: E=Sophos;i="5.93,304,1654585200"; d="scan'208";a="295175088" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Sep 2022 16:48:20 -0700 X-IronPort-AV: E=Sophos;i="5.93,304,1654585200"; d="scan'208";a="704567489" Received: from eekitend-mobl1.amr.corp.intel.com (HELO adixit-arch.intel.com) ([10.212.222.219]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Sep 2022 16:48:20 -0700 Date: Fri, 09 Sep 2022 16:47:36 -0700 Message-ID: <87mtb85cvb.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" To: Umesh Nerlige Ramappa In-Reply-To: <20220823204155.8178-2-umesh.nerlige.ramappa@intel.com> References: <20220823204155.8178-1-umesh.nerlige.ramappa@intel.com> <20220823204155.8178-2-umesh.nerlige.ramappa@intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/28.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII Subject: Re: [Intel-gfx] [PATCH 01/19] drm/i915/perf: Fix OA filtering logic for GuC mode X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: intel-gfx@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" On Tue, 23 Aug 2022 13:41:37 -0700, Umesh Nerlige Ramappa wrote: > Hi Umesh, > With GuC mode of submission, GuC is in control of defining the context id field > that is part of the OA reports. To filter reports, UMD and KMD must know what sw > context id was chosen by GuC. There is not interface between KMD and GuC to > determine this, so read the upper-dword of EXECLIST_STATUS to filter/squash OA > reports for the specific context. Do you think it is worth defining an interface for GuC to return the sw ctx_id it will be using for a ctx, say at ctx registration time? The scheme implemented in this patch to read the ctx_id is certainly very clever, at least to me. But as Lionel was saying is it a agreed upon immutable interface? If it is, we can go with this patch. (Though even then we will need to maintain this code even if in the future GuC FW is changed to return the ctx_id in order to preserve backwards comptability with previous GuC versions. So maybe better to have a real interface between GuC and KMD earlier rather than later?). Also a couple of general comments below. > > Signed-off-by: Umesh Nerlige Ramappa > --- > drivers/gpu/drm/i915/gt/intel_lrc.h | 2 + > drivers/gpu/drm/i915/i915_perf.c | 141 ++++++++++++++++++++++++---- > 2 files changed, 124 insertions(+), 19 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h > index a390f0813c8b..7111bae759f3 100644 > --- a/drivers/gpu/drm/i915/gt/intel_lrc.h > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.h > @@ -110,6 +110,8 @@ enum { > #define XEHP_SW_CTX_ID_WIDTH 16 > #define XEHP_SW_COUNTER_SHIFT 58 > #define XEHP_SW_COUNTER_WIDTH 6 > +#define GEN12_GUC_SW_CTX_ID_SHIFT 39 > +#define GEN12_GUC_SW_CTX_ID_WIDTH 16 > > static inline void lrc_runtime_start(struct intel_context *ce) > { > diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c > index f3c23fe9ad9c..735244a3aedd 100644 > --- a/drivers/gpu/drm/i915/i915_perf.c > +++ b/drivers/gpu/drm/i915/i915_perf.c > @@ -1233,6 +1233,125 @@ static struct intel_context *oa_pin_context(struct i915_perf_stream *stream) > return stream->pinned_ctx; > } > > +static int > +__store_reg_to_mem(struct i915_request *rq, i915_reg_t reg, u32 ggtt_offset) > +{ > + u32 *cs, cmd; > + > + cmd = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT; > + if (GRAPHICS_VER(rq->engine->i915) >= 8) > + cmd++; > + > + cs = intel_ring_begin(rq, 4); > + if (IS_ERR(cs)) > + return PTR_ERR(cs); > + > + *cs++ = cmd; > + *cs++ = i915_mmio_reg_offset(reg); > + *cs++ = ggtt_offset; > + *cs++ = 0; > + > + intel_ring_advance(rq, cs); > + > + return 0; > +} > + > +static int > +__read_reg(struct intel_context *ce, i915_reg_t reg, u32 ggtt_offset) > +{ > + struct i915_request *rq; > + int err; > + > + rq = i915_request_create(ce); > + if (IS_ERR(rq)) > + return PTR_ERR(rq); > + > + i915_request_get(rq); > + > + err = __store_reg_to_mem(rq, reg, ggtt_offset); > + > + i915_request_add(rq); > + if (!err && i915_request_wait(rq, 0, HZ / 2) < 0) > + err = -ETIME; > + > + i915_request_put(rq); > + > + return err; > +} > + > +static int > +gen12_guc_sw_ctx_id(struct intel_context *ce, u32 *ctx_id) > +{ > + struct i915_vma *scratch; > + u32 *val; > + int err; > + > + scratch = __vm_create_scratch_for_read_pinned(&ce->engine->gt->ggtt->vm, 4); > + if (IS_ERR(scratch)) > + return PTR_ERR(scratch); > + > + err = i915_vma_sync(scratch); > + if (err) > + goto err_scratch; > + > + err = __read_reg(ce, RING_EXECLIST_STATUS_HI(ce->engine->mmio_base), > + i915_ggtt_offset(scratch)); Actually the RING_EXECLIST_STATUS_HI is MMIO so can be read using say ENGINE_READ/intel_uncore_read. The only issue is how to read it when this ctx is scheduled which is cleverly solved by the scheme above. But I am not sure if there is any other simpler way to do it. > + if (err) > + goto err_scratch; > + > + val = i915_gem_object_pin_map_unlocked(scratch->obj, I915_MAP_WB); > + if (IS_ERR(val)) { > + err = PTR_ERR(val); > + goto err_scratch; > + } > + > + *ctx_id = *val; > + i915_gem_object_unpin_map(scratch->obj); > + > +err_scratch: > + i915_vma_unpin_and_release(&scratch, 0); > + return err; > +} > + > +/* > + * For execlist mode of submission, pick an unused context id > + * 0 - (NUM_CONTEXT_TAG -1) are used by other contexts > + * XXX_MAX_CONTEXT_HW_ID is used by idle context > + * > + * For GuC mode of submission read context id from the upper dword of the > + * EXECLIST_STATUS register. > + */ > +static int gen12_get_render_context_id(struct i915_perf_stream *stream) > +{ > + u32 ctx_id, mask; > + int ret; > + > + if (intel_engine_uses_guc(stream->engine)) { > + ret = gen12_guc_sw_ctx_id(stream->pinned_ctx, &ctx_id); > + if (ret) > + return ret; > + > + mask = ((1U << GEN12_GUC_SW_CTX_ID_WIDTH) - 1) << > + (GEN12_GUC_SW_CTX_ID_SHIFT - 32); > + } else if (GRAPHICS_VER_FULL(stream->engine->i915) >= IP_VER(12, 50)) { > + ctx_id = (XEHP_MAX_CONTEXT_HW_ID - 1) << > + (XEHP_SW_CTX_ID_SHIFT - 32); > + > + mask = ((1U << XEHP_SW_CTX_ID_WIDTH) - 1) << > + (XEHP_SW_CTX_ID_SHIFT - 32); > + } else { > + ctx_id = (GEN12_MAX_CONTEXT_HW_ID - 1) << > + (GEN11_SW_CTX_ID_SHIFT - 32); > + > + mask = ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << > + (GEN11_SW_CTX_ID_SHIFT - 32); Previously I missed that these ctx_id's for non-GuC cases are just constants. How does it work in these cases? Thanks. -- Ashutosh