Date: Mon, 18 Jul 2022 18:06:30 +0200
From: 
Mauro Carvalho Chehab
To: Tvrtko Ursulin
Cc: Thomas Hellström, David Airlie, intel-gfx@lists.freedesktop.org,
 Lucas De Marchi, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, Chris Wilson, Rodrigo Vivi, Dave Airlie,
 stable@vger.kernel.org, Mauro Carvalho Chehab
Subject: Re: [Intel-gfx] [PATCH v2 05/21] drm/i915/gt: Skip TLB invalidations once wedged
Message-ID: <20220718180630.7bef2fd9@maurocar-mobl2>
List-Id: Intel graphics driver community testing & development

On Mon, 18 Jul 2022 14:45:22 +0100
Tvrtko Ursulin wrote:

> On 14/07/2022 13:06, Mauro Carvalho Chehab wrote:
> > From: Chris Wilson
> >
> > Skip all further TLB invalidations once the device is wedged and
> > had been reset, as, on such cases, it can no longer process instructions
> > on the GPU and the user no longer has access to the TLB's in each engine.
> >
> > That helps to reduce the performance regression introduced by TLB
> > invalidate logic.
> >
> > Cc: stable@vger.kernel.org
> > Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
>
> Is the claim of a performance regression this solved based on a wedged
> GPU which does not work any more to the extend where mmio tlb
> invalidation requests keep timing out? If so please clarify in the
> commit text and then it looks good to me. Even if it is IMO a very
> borderline situation to declare something a fix.

Indeed, this helps in a borderline situation: if the GT is wedged, TLB
invalidation will time out, so it makes sense to keep the patch with a
commit message like:

	drm/i915/gt: Skip TLB invalidations once wedged

	Skip all further TLB invalidations once the device is wedged and
	has been reset, as, in such cases, it can no longer process
	instructions on the GPU and the user no longer has access to the
	TLBs in each engine. So, an attempt to do a TLB cache invalidation
	will produce a timeout.

	That helps to reduce the performance regression introduced by the
	TLB invalidation logic.

Regards,
Mauro
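
For illustration only, the early return described in the commit message can be modelled in plain userspace C. This is a sketch under stated assumptions, not the driver code: `struct model_gt`, `model_engine_invalidate()` and `model_invalidate_tlbs()` are hypothetical names invented for this example, and the "timeout" is just a counter rather than a real MMIO wait.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the GT state; NOT the real struct intel_gt. */
struct model_gt {
	bool wedged;              /* set once the GPU is declared unusable */
	unsigned int mmio_writes; /* invalidation requests actually issued */
	unsigned int timeouts;    /* requests that were never acknowledged */
};

/*
 * Model of a single per-engine invalidation request. A wedged GPU never
 * acknowledges the request, so each attempt is counted as a timeout.
 */
bool model_engine_invalidate(struct model_gt *gt)
{
	gt->mmio_writes++;
	if (gt->wedged) {
		gt->timeouts++;
		return false;
	}
	return true;
}

/*
 * Invalidate the TLBs of nr_engines engines. Once the GT is wedged,
 * return early: no request is issued, so no timeout can be hit.
 */
void model_invalidate_tlbs(struct model_gt *gt, size_t nr_engines)
{
	size_t i;

	if (gt->wedged)
		return;

	for (i = 0; i < nr_engines; i++)
		model_engine_invalidate(gt);
}
```

In this model, calling `model_invalidate_tlbs()` on a wedged GT leaves both counters untouched, which is the point of the patch: the cost being avoided is the per-engine wait that would otherwise end in a timeout.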