From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 15 Aug 2023 16:20:26 -0700
Message-ID: <87zg2rsxj9.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Andi Shyti
Cc: Matthew Brost, Badal Nilawar
Subject: Re: [PATCH v2 2/6] drm/xe/hwmon: Expose power attributes
References: <20230627183043.2024530-1-badal.nilawar@intel.com>
	<20230627183043.2024530-3-badal.nilawar@intel.com>
X-Mailing-List: linux-hwmon@vger.kernel.org

On Thu, 29 Jun 2023 07:09:31 -0700, Andi Shyti wrote:
> Hi Badal/Andi/Matt,
>
> > > +static int hwm_power_max_write(struct hwm_drvdata *ddat, long value)
> > > +{
> > > +	struct xe_hwmon *hwmon = ddat->hwmon;
> > > +	DEFINE_WAIT(wait);
> > > +	int ret = 0;
> > > +	u32 nval;
> > > +
> > > +	/* Block waiting for GuC reset to complete when needed */
> > > +	for (;;) {
>
> with a do...while() you shouldn't need a for(;;)... your choice,
> not going to beat on that.
>
> > > +		mutex_lock(&hwmon->hwmon_lock);
> > > +
> > > +		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> > > +
> > > +		if (!hwmon->ddat.reset_in_progress)
> > > +			break;
> > > +
> > > +		if (signal_pending(current)) {
> > > +			ret = -EINTR;
> > > +			break;
>
> cough! cough! unlock! cough! cough!

Why? It's fine as is.

> > > +		}
> > > +
> > > +		mutex_unlock(&hwmon->hwmon_lock);
> > > +
> > > +		schedule();
> > > +	}
> > > +	finish_wait(&ddat->waitq, &wait);
> > > +	if (ret)
> > > +		goto unlock;
> >
> > Any way to not open code this? We do similar with
> > xe_guc_submit_reset_wait, could we expose a global reset-in-progress
> > function at the gt level?
I don't know of any way to not open code this which is guaranteed to not
deadlock (not to say there are no other ways).

> > > +
> > > +	xe_device_mem_access_get(gt_to_xe(ddat->gt));
> > > +
> >
> > This certainly is an outer most thing, e.g. doing this under
> > hwmon->hwmon_lock seems dangerous. Again the upper levels of the stack
> > should do the xe_device_mem_access_get, which it does. Just pointing out
> > doing xe_device_mem_access_get/put under a lock isn't a good idea.

Agree, this is the only change we should make to this code.

> > Also the loop which acquires hwmon->hwmon_lock is confusing too.

Confusing but correct.

> > > +	/* Disable PL1 limit and verify, as limit cannot be disabled on all platforms */
> > > +	if (value == PL1_DISABLE) {
> > > +		process_hwmon_reg(ddat, pkg_rapl_limit, reg_rmw, &nval,
> > > +				  PKG_PWR_LIM_1_EN, 0);
> > > +		process_hwmon_reg(ddat, pkg_rapl_limit, reg_read, &nval,
> > > +				  PKG_PWR_LIM_1_EN, 0);
> > > +
> > > +		if (nval & PKG_PWR_LIM_1_EN)
> > > +			ret = -ENODEV;
> > > +		goto exit;
>
> cough! cough! lock! cough! cough!

Why? It's fine as is.

> > > +	}
> > > +
> > > +	/* Computation in 64-bits to avoid overflow. Round to nearest. */
> > > +	nval = DIV_ROUND_CLOSEST_ULL((u64)value << hwmon->scl_shift_power, SF_POWER);
> > > +	nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);
> > > +
> > > +	process_hwmon_reg(ddat, pkg_rapl_limit, reg_rmw, &nval,
> > > +			  PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
> > > +exit:
> > > +	xe_device_mem_access_put(gt_to_xe(ddat->gt));
> > > +unlock:
> > > +	mutex_unlock(&hwmon->hwmon_lock);
>
> mmhhh???... jokes apart this is so error prone that it will
> deadlock as soon as someone will think of editing this file :)

Why is it error prone and how will it deadlock? In fact this
prepare_to_wait/finish_wait pattern is widely used in the kernel (see e.g.
rpm_suspend) and is one of the few patterns guaranteed to not deadlock (see
also 6.2.5 "Advanced Sleeping" in LDD3 if needed). This is the same code
pattern we also implemented in i915 hwm_power_max_write.

Just to give some history: in i915 a scheme which held a mutex across GuC
reset was proposed first, but that was deemed too risky, so this more
complex scheme was implemented instead.

Regarding editing the code: it's kernel code involving locking which needs
to be edited carefully, and it's all confined to a single function (or maybe
a couple of functions), but otherwise yes, definitely not something to mess
around with :)

> It worried me already at the first part.
>
> Please, as Matt said, have a more linear locking here.

Afaiac I don't know of any other race-free way to do this other than what
was done in v2 (and in i915). So I want to discard the changes made in v3
and go back to the changes made in v2. If others can suggest other ways
which they can guarantee are race-free and correct and can R-b that code,
that's fine. But I can R-b the v2 code (with the only change of moving
xe_device_mem_access_get out of the lock). (Of course I am only talking
about R-b'ing the above scheme; other review comments may be valid.)

Badal, also, if there are questions about this scheme, maybe we should move
this to a separate patch as was done in i915? We can just return -EAGAIN as
in 1b44019a93e2.

Thanks.
--
Ashutosh