From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 30 Nov 2022 18:18:13 -0800
Message-ID: <87tu2f3lzu.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Umesh Nerlige Ramappa
Cc: intel-gfx@lists.freedesktop.org
In-Reply-To: <20221201010535.1097741-2-umesh.nerlige.ramappa@intel.com>
References: <20221201010535.1097741-1-umesh.nerlige.ramappa@intel.com>
 <20221201010535.1097741-2-umesh.nerlige.ramappa@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Subject: Re: [Intel-gfx] [PATCH v2 1/4] drm/i915/mtl: Resize noa_wait BO size to save restore GPR regs
List-Id: Intel graphics driver community testing & development

On Wed, 30 Nov 2022 17:05:32 -0800, Umesh Nerlige Ramappa wrote:
>
> On MTL, gt->scratch was using stolen lmem. An MI_SRM to stolen lmem
> caused a hang that was attributed to saving and restoring the GPR
> registers used for noa_wait.
>
> Add an additional page in noa_wait BO to save/restore GPR registers for
> the noa_wait logic.

Mostly copying R-b's from https://patchwork.freedesktop.org/series/111411/ here.

Reviewed-by: Ashutosh Dixit

>
> Signed-off-by: Umesh Nerlige Ramappa
> ---
>  drivers/gpu/drm/i915/gt/intel_gt_types.h |  6 ------
>  drivers/gpu/drm/i915/i915_perf.c         | 25 ++++++++++++++++--------
>  2 files changed, 17 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> index c1d9cd255e06..13dffe0a3d20 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> @@ -296,12 +296,6 @@ enum intel_gt_scratch_field {
>
>  	/* 8 bytes */
>  	INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA = 256,
> -
> -	/* 6 * 8 bytes */
> -	INTEL_GT_SCRATCH_FIELD_PERF_CS_GPR = 2048,
> -
> -	/* 4 bytes */
> -	INTEL_GT_SCRATCH_FIELD_PERF_PREDICATE_RESULT_1 = 2096,
>  };
>
>  #endif /* __INTEL_GT_TYPES_H__ */
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 00e09bb18b13..7790a88f10d8 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -1842,8 +1842,7 @@ static u32 *save_restore_register(struct i915_perf_stream *stream, u32 *cs,
>  	for (d = 0; d < dword_count; d++) {
>  		*cs++ = cmd;
>  		*cs++ = i915_mmio_reg_offset(reg) + 4 * d;
> -		*cs++ = intel_gt_scratch_offset(stream->engine->gt,
> -						offset) + 4 * d;
> +		*cs++ = i915_ggtt_offset(stream->noa_wait) + offset + 4 * d;
>  		*cs++ = 0;
>  	}
>
> @@ -1876,7 +1875,13 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
>  		MI_PREDICATE_RESULT_2_ENGINE(base) :
>  		MI_PREDICATE_RESULT_1(RENDER_RING_BASE);
>
> -	bo = i915_gem_object_create_internal(i915, 4096);
> +	/*
> +	 * gt->scratch was being used to save/restore the GPR registers, but on
> +	 * MTL the scratch uses stolen lmem. An MI_SRM to this memory region
> +	 * causes an engine hang. Instead allocate an additional page here to
> +	 * save/restore GPR registers
> +	 */
> +	bo = i915_gem_object_create_internal(i915, 8192);
>  	if (IS_ERR(bo)) {
>  		drm_err(&i915->drm,
>  			"Failed to allocate NOA wait batchbuffer\n");
> @@ -1910,14 +1915,19 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
>  		goto err_unpin;
>  	}
>
> +	stream->noa_wait = vma;
> +
> +#define GPR_SAVE_OFFSET 4096
> +#define PREDICATE_SAVE_OFFSET 4160
> +
>  	/* Save registers. */
>  	for (i = 0; i < N_CS_GPR; i++)
>  		cs = save_restore_register(
>  			stream, cs, true /* save */, CS_GPR(i),
> -			INTEL_GT_SCRATCH_FIELD_PERF_CS_GPR + 8 * i, 2);
> +			GPR_SAVE_OFFSET + 8 * i, 2);
>  	cs = save_restore_register(
>  		stream, cs, true /* save */, mi_predicate_result,
> -		INTEL_GT_SCRATCH_FIELD_PERF_PREDICATE_RESULT_1, 1);
> +		PREDICATE_SAVE_OFFSET, 1);
>
>  	/* First timestamp snapshot location. */
>  	ts0 = cs;
> @@ -2033,10 +2043,10 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
>  	for (i = 0; i < N_CS_GPR; i++)
>  		cs = save_restore_register(
>  			stream, cs, false /* restore */, CS_GPR(i),
> -			INTEL_GT_SCRATCH_FIELD_PERF_CS_GPR + 8 * i, 2);
> +			GPR_SAVE_OFFSET + 8 * i, 2);
>  	cs = save_restore_register(
>  		stream, cs, false /* restore */, mi_predicate_result,
> -		INTEL_GT_SCRATCH_FIELD_PERF_PREDICATE_RESULT_1, 1);
> +		PREDICATE_SAVE_OFFSET, 1);
>
>  	/* And return to the ring. */
>  	*cs++ = MI_BATCH_BUFFER_END;
> @@ -2046,7 +2056,6 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
>  	i915_gem_object_flush_map(bo);
>  	__i915_gem_object_release_map(bo);
>
> -	stream->noa_wait = vma;
>  	goto out_ww;
>
>  err_unpin:
> --
> 2.36.1
>
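
For anyone double-checking the new layout, here is a minimal standalone sketch (not driver code) of the offset arithmetic the patch relies on. The sizes and the two save offsets are copied from the quoted diff. SKETCH_N_CS_GPR is an assumption: it takes the upper bound implied by the removed "6 * 8 bytes" scratch comment rather than the real N_CS_GPR value. sketch_save_addr(), sketch_layout_ok() and bo_ggtt_base are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the quoted patch. */
#define NOA_WAIT_BO_SIZE      8192u
#define GPR_SAVE_OFFSET       4096u
#define PREDICATE_SAVE_OFFSET 4160u

/* Assumption: upper bound from the removed "6 * 8 bytes" scratch comment. */
#define SKETCH_N_CS_GPR       6u

/*
 * Address of dword d of a saved register, mirroring the expression
 * i915_ggtt_offset(stream->noa_wait) + offset + 4 * d from the patch.
 * bo_ggtt_base is a hypothetical stand-in for the GGTT offset of the BO.
 */
static uint32_t sketch_save_addr(uint32_t bo_ggtt_base, uint32_t offset,
				 uint32_t d)
{
	return bo_ggtt_base + offset + 4u * d;
}

/*
 * Returns 1 if the GPR slots start in the second page, end before the
 * predicate slot, and the predicate slot ends inside the BO.
 */
static int sketch_layout_ok(void)
{
	uint32_t gpr_end = GPR_SAVE_OFFSET + 8u * SKETCH_N_CS_GPR;
	uint32_t pred_end = PREDICATE_SAVE_OFFSET + 4u;

	return GPR_SAVE_OFFSET >= 4096u &&
	       gpr_end <= PREDICATE_SAVE_OFFSET &&
	       pred_end <= NOA_WAIT_BO_SIZE;
}
```

Under that assumption the GPR slots end at byte 4144, so they do not overlap the predicate slot at 4160, and everything stays inside the second page of the 8192-byte BO.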