From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 944C7C433EF for ; Wed, 17 Nov 2021 18:49:55 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5B85661B71 for ; Wed, 17 Nov 2021 18:49:55 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 5B85661B71 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5715B6E7E2; Wed, 17 Nov 2021 18:49:50 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 72EAC6E7DA; Wed, 17 Nov 2021 18:49:48 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10171"; a="320246545" X-IronPort-AV: E=Sophos;i="5.87,241,1631602800"; d="scan'208";a="320246545" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Nov 2021 10:49:48 -0800 X-IronPort-AV: E=Sophos;i="5.87,241,1631602800"; d="scan'208";a="646128551" Received: from rstock-mobl.ger.corp.intel.com (HELO [10.249.254.164]) ([10.249.254.164]) by fmsmga001-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Nov 2021 10:49:46 -0800 Message-ID: <9ea7ae8e-c9c4-8b1e-2057-5be69eb35555@linux.intel.com> Date: Wed, 17 Nov 2021 19:49:44 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0 Content-Language: en-US To: Matthew Auld , intel-gfx@lists.freedesktop.org References: <20211117142024.1043017-1-matthew.auld@intel.com> From: =?UTF-8?Q?Thomas_Hellstr=c3=b6m?= In-Reply-To: <20211117142024.1043017-1-matthew.auld@intel.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Subject: Re: [Intel-gfx] [PATCH v2 1/6] drm/i915: move the pre_pin earlier X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" On 11/17/21 15:20, Matthew Auld wrote: > In intel_context_do_pin_ww, when calling into the pre_pin hook(which is > passed the ww context) it could in theory return -EDEADLK(which is very > likely with debug kernels), once we start adding more ww locking in there, > like in the next patch. If so then we need to be mindful of having to > restart the do_pin at this point. > > If this is the kernel_context, or some other early in-kernel context > where we have yet to setup the default_state, then we always inhibit the > context restore, and instead rely on the delayed active_release to set > the CONTEXT_VALID_BIT for us(if we even care), which should indicate > that we have context switched away, and that our newly saved context > state should now be valid. However, since we currently grab the active > reference before the potential ww dance, we can end up setting the > CONTEXT_VALID_BIT much too early, if we need to backoff, and then upon > re-trying the do_pin, we could potentially cause the hardware to > incorrectly load some garbage context state when later context switching > to that context, but at the very least this will trigger the > GEM_BUG_ON() in __engine_unpark. For now let's just move any ww dance > stuff prior to arming the active reference. > > For normal user contexts this shouldn't be a concern, since we should > already have the default_state ready when initialising the lrc state, > and so there should be no concern with active_release somehow > prematurely setting the CONTEXT_VALID_BIT. > > v2(Thomas): > - Also re-order the union unwind > > Signed-off-by: Matthew Auld > Cc: Thomas Hellström > Cc: Maarten Lankhorst Reviewed-by: Thomas Hellström > --- > drivers/gpu/drm/i915/gt/intel_context.c | 12 ++++++------ > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c > index 5634d14052bc..4c296de1d67d 100644 > --- a/drivers/gpu/drm/i915/gt/intel_context.c > +++ b/drivers/gpu/drm/i915/gt/intel_context.c > @@ -228,17 +228,17 @@ int __intel_context_do_pin_ww(struct intel_context *ce, > if (err) > return err; > > - err = i915_active_acquire(&ce->active); > + err = ce->ops->pre_pin(ce, ww, &vaddr); > if (err) > goto err_ctx_unpin; > > - err = ce->ops->pre_pin(ce, ww, &vaddr); > + err = i915_active_acquire(&ce->active); > if (err) > - goto err_release; > + goto err_post_unpin; > > err = mutex_lock_interruptible(&ce->pin_mutex); > if (err) > - goto err_post_unpin; > + goto err_release; > > intel_engine_pm_might_get(ce->engine); > > @@ -273,11 +273,11 @@ int __intel_context_do_pin_ww(struct intel_context *ce, > > err_unlock: > mutex_unlock(&ce->pin_mutex); > +err_release: > + i915_active_release(&ce->active); > err_post_unpin: > if (!handoff) > ce->ops->post_unpin(ce); > -err_release: > - i915_active_release(&ce->active); > err_ctx_unpin: > intel_context_post_unpin(ce); >