From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CE853C433DB for ; Wed, 24 Mar 2021 17:15:52 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9390861A15 for ; Wed, 24 Mar 2021 17:15:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9390861A15 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BF5B66ECB1; Wed, 24 Mar 2021 17:15:50 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 70F736ECAF; Wed, 24 Mar 2021 17:15:48 +0000 (UTC) IronPort-SDR: MIqkBXbZwIIdcdlOowSPROxYjWzKGZ0338AuHbfDs3yqv5jwQQZyyQLVHZqCghLxmE2fubd/W8 H3P4Az++Lolw== X-IronPort-AV: E=McAfee;i="6000,8403,9933"; a="190171790" X-IronPort-AV: E=Sophos;i="5.81,275,1610438400"; d="scan'208";a="190171790" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 24 Mar 2021 10:15:39 -0700 IronPort-SDR: JyMYzMGxr45YEnkjY8D+JDSdmAAmg1MSMtQTDvwcAGgUQpjzPCgtrL6fhvn3clxp8ws+cfqoBt 5UOfKbaEFi1Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.81,275,1610438400"; d="scan'208";a="391375598" Received: from stinkbox.fi.intel.com (HELO stinkbox) ([10.237.72.171]) by orsmga002.jf.intel.com with SMTP; 24 Mar 2021 10:15:37 -0700 Received: by stinkbox (sSMTP sendmail emulation); Wed, 24 Mar 2021 19:15:36 +0200 Date: Wed, 24 Mar 2021 19:15:36 +0200 From: Ville =?iso-8859-1?Q?Syrj=E4l=E4?= To: Daniel Vetter Message-ID: References: <20210323155059.628690-1-maarten.lankhorst@linux.intel.com> <20210323155059.628690-64-maarten.lankhorst@linux.intel.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: X-Patchwork-Hint: comment Subject: Re: [Intel-gfx] [PATCH v9 63/70] drm/i915: Move gt_revoke() slightly X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Thomas =?iso-8859-1?Q?Hellstr=F6m?= , intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" On Wed, Mar 24, 2021 at 06:00:12PM +0100, Daniel Vetter wrote: > On Tue, Mar 23, 2021 at 04:50:52PM +0100, Maarten Lankhorst wrote: > > We get a lockdep splat when the reset mutex is held, because it can be > > taken from fence_wait. This conflicts with the mmu notifier we have, > > because we recurse between reset mutex and mmap lock -> mmu notifier. > > = > > Remove this recursion by calling revoke_mmaps before taking the lock. > > = > > The reset code still needs fixing, as taking mmap locks during reset > > is not allowed. > > = > > Signed-off-by: Maarten Lankhorst > > Reviewed-by: Thomas Hellstr=F6m > > --- > > drivers/gpu/drm/i915/gt/intel_reset.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > = > > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i9= 15/gt/intel_reset.c > > index 990cb4adbb9a..447f589750c2 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_reset.c > > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c > > @@ -970,8 +970,6 @@ static int do_reset(struct intel_gt *gt, intel_engi= ne_mask_t stalled_mask) > > { > > int err, i; > > = > > - gt_revoke(gt); > > - > > err =3D __intel_gt_reset(gt, ALL_ENGINES); > > for (i =3D 0; err && i < RESET_MAX_RETRIES; i++) { > > msleep(10 * (i + 1)); > > @@ -1026,6 +1024,9 @@ void intel_gt_reset(struct intel_gt *gt, > > = > > might_sleep(); > > GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, >->reset.flags)); > > + > = > I've added a FIXME comment here just so we don't totally forget. This will > also blow up again when we wrap the entire reset path into a dma_fence > critical section annotation (at least going forward, we can't do that on > hw that needs display reset with the current code unfortunately). > = > But I did look at the code which originally added this in > = > commit 2caffbf1176256cc4f8d4e5c3c524fc689cb9876 > Author: Chris Wilson > Date: Fri Feb 8 15:37:03 2019 +0000 > = > drm/i915: Revoke mmaps and prevent access to fence registers across r= eset > = > and noped right out. > = > I think this complexity needs to go entirely, and instead we just protect > the fence register state to make sure that after reset they are all good > again: > - add a new mutex for low level fence register state > - hold that mutex around fence register writes (really just the low level > fence writes) > - hold it in the reset path when we restore fence registers > = > This means that a global reset also thrashes mmaps, but it's a global > reset we're talking about here, everything is thrash anyway. Plus/minus > fenced gtt mmaps really doesn't change the tally. My recollection is that GPU reset doesn't actually clobber the fence = registers. Though not 100% sure I can trust my brain on this. Also dunno if it actually matter here or not, but figured I'd point it out. -- = Ville Syrj=E4l=E4 Intel _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx