Date: Thu, 30 Mar 2023 19:17:49 -0700
Message-ID: <87o7o9d5pu.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Rodrigo Vivi
Cc: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH] drm/i915/hwmon: Use 0 to designate disabled PL1 power limit
References: <20230328233543.1091127-1-ashutosh.dixit@intel.com> <87cz4qlre6.wl-ashutosh.dixit@intel.com>
List-Id: Intel graphics driver community testing & development

On Thu, 30 Mar 2023 08:44:34 -0700, Rodrigo Vivi wrote:
>
> On Wed, Mar 29, 2023 at 10:50:09PM -0700, Dixit, Ashutosh wrote:
> > On Tue, 28 Mar 2023 16:35:43 -0700, Ashutosh Dixit wrote:
> > >
> > > On ATSM the PL1 limit is disabled at power up. The previous uapi assumed
> > > that the PL1 limit is always enabled and therefore did not have a notion of
> > > a disabled PL1 limit. This results in erroneous PL1 limit values when the
> > > PL1 limit is disabled. For example at power up, the disabled ATSM PL1 limit
> > > was previously shown as 0 which means a low PL1 limit whereas the limit
> > > being disabled actually implies a high effective PL1 limit value.
> > >
> > > To get round this problem, the PL1 limit uapi is expanded to include a
> > > special value 0 to designate a disabled PL1 limit.
> >
> > This patch is another attempt to show when the PL1 power limit is disabled
> > and to disable it when it needs to. Previous abandoned attempts to do this
> > are [1] and [2].
> >
> > The preferred way to do this was [2] but that was NAK'd by hwmon folks (see
> > [2]). That is why here we fall back on the approach in [1].
>
> I still don't get it, but let's move on...
>
> >
> > This patch is identical to [1] except that the value used to disable the
> > PL1 limit has been changed to 0 (from -1 in [1]) as was suggested in [2]
> > (both -1 and 0 seem ok for the purpose).
> >
> > > Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/8062
> > > Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/8060
> >
> > The link between this patch and these pretty serious bugs might not be
> > immediately clear so here's an explanation:
> >
> > * Because on ATSM the PL1 power limit is disabled on power up and there
> >   were no means to enable it, in 6fd3d8bf89fc we implemented the means to
> >   enable the limit when the PL1 hwmon entry (power1_max) was written to.
> >
> > * Now there is an IGT igt@i915_hwmon@hwmon_write which (a) reads orig value
> >   from all hwmon sysfs (b) does a bunch of random writes and finally (c)
> >   restores the orig value read. On ATSM since the orig value was 0, when
> >   the IGT restores the 0 value, the PL1 limit is now enabled with a value
> >   of 0.
> >
> > * PL1 limit of 0 implies a low PL1 limit which causes GPU freq to fall to
> >   100 MHz. This causes GuC FW load and several IGT's to start timing out
> >   and gives rise the above (and even more) bugs about GuC FW load timing
> >   out.
>
> I believe these 3 bullets are key information that deserves to be in
> the commit message itself.

Done in v2.

> With that there,
>
> Reviewed-by: Rodrigo Vivi

Thanks.
--
Ashutosh

> >
> > * After this patch, writing 0 would disable the PL1 limit instead of
> >   enabling it, avoiding the freq drop issue above, and resolving this Intel
> >   CI issue.
> >
> > Thanks.
> > --
> > Ashutosh
> >
> > [1] https://patchwork.freedesktop.org/patch/522612/?series=113972&rev=1
> > [2] https://patchwork.freedesktop.org/patch/522652/?series=113984&rev=1
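
For anyone following the thread, the read / write / restore sequence from the bullets above can be sketched with a toy model. This is only an illustration, not the i915 code: `Pl1State`, `read_power1_max`, `write_power1_max`, `hwmon_write_roundtrip` and the scratch values are hypothetical names and numbers made up for the sketch, and the fixed scratch values stand in for the random writes the IGT does.

```python
from dataclasses import dataclass


@dataclass
class Pl1State:
    """Hypothetical model of the PL1 limit; not the driver's data structure."""
    enabled: bool = False  # on ATSM the PL1 limit is disabled at power up
    limit: int = 0         # programmed limit, arbitrary units for this sketch


def read_power1_max(state: Pl1State) -> int:
    # A disabled limit reads back as 0, as described in the thread.
    return state.limit if state.enabled else 0


def write_power1_max(state: Pl1State, value: int, zero_disables: bool) -> None:
    # zero_disables=True models the behaviour proposed by the patch:
    # writing 0 disables the limit instead of programming a limit of 0.
    if zero_disables and value == 0:
        state.enabled = False
        return
    # Otherwise any write enables the limit with the written value.
    state.enabled = True
    state.limit = value


def hwmon_write_roundtrip(state: Pl1State, zero_disables: bool,
                          scratch_values=(1000, 5000)) -> Pl1State:
    # (a) read the original value, (b) write some scratch values,
    # (c) restore the original value, mirroring the IGT steps above.
    orig = read_power1_max(state)
    for value in scratch_values:
        write_power1_max(state, value, zero_disables)
    write_power1_max(state, orig, zero_disables)
    return state


before_patch = hwmon_write_roundtrip(Pl1State(), zero_disables=False)
after_patch = hwmon_write_roundtrip(Pl1State(), zero_disables=True)
print(before_patch)  # limit left enabled with a value of 0
print(after_patch)   # limit left disabled
```

In this model the old behaviour ends the round trip with the limit enabled at 0, which is the state the thread associates with the frequency drop, while treating 0 as "disabled" returns the model to its power-up state.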