Date: Tue, 15 Aug 2023 16:20:26 -0700
Message-ID: <87zg2rsxj9.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh"
To: Andi Shyti
Cc: linux-hwmon@vger.kernel.org, intel-xe@lists.freedesktop.org, linux@roeck-us.net
Subject: Re: [Intel-xe] [PATCH v2 2/6] drm/xe/hwmon: Expose power attributes
References: <20230627183043.2024530-1-badal.nilawar@intel.com> <20230627183043.2024530-3-badal.nilawar@intel.com>

On Thu, 29 Jun 2023 07:09:31 -0700, Andi Shyti wrote:
>

Hi Badal/Andi/Matt,

> > > +static int hwm_power_max_write(struct hwm_drvdata *ddat, long value)
> > > +{
> > > +	struct xe_hwmon *hwmon = ddat->hwmon;
> > > +	DEFINE_WAIT(wait);
> > > +	int ret = 0;
> > > +	u32 nval;
> > > +
> > > +	/* Block waiting for GuC reset to complete when needed */
> > > +	for (;;) {
>
> with a do...while() you shouldn't need a for(;;)... your choice,
> not going to beat on that.
>
> > > +		mutex_lock(&hwmon->hwmon_lock);
> > > +
> > > +		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> > > +
> > > +		if (!hwmon->ddat.reset_in_progress)
> > > +			break;
> > > +
> > > +		if (signal_pending(current)) {
> > > +			ret = -EINTR;
> > > +			break;
>
> cough! cough! unlock! cough! cough!

Why? It's fine as is.
>
> > > +		}
> > > +
> > > +		mutex_unlock(&hwmon->hwmon_lock);
> > > +
> > > +		schedule();
> > > +	}
> > > +	finish_wait(&ddat->waitq, &wait);
> > > +	if (ret)
> > > +		goto unlock;
> >
> > Anyway to not open code this? We similar in with
> > xe_guc_submit_reset_wait, could we expose a global reset in progress in
> > function which we can expose at the gt level?

I don't know of any way to not open code this which is guaranteed to not
deadlock (not to say there are no other ways).

> >
> > > +
> > > +	xe_device_mem_access_get(gt_to_xe(ddat->gt));
> > > +
> >
> > This certainly is an outer most thing, e.g. doing this under
> > hwmon->hwmon_lock seems dangerous. Again the upper levels of the stack
> > should do the xe_device_mem_access_get, which it does. Just pointing out
> > doing xe_device_mem_access_get/put under a lock isn't a good idea.

Agree, this is the only change we should make to this code.

> >
> > Also the the loop which acquires hwmon->hwmon_lock is confusing too.

Confusing but correct.

> >
> > > +	/* Disable PL1 limit and verify, as limit cannot be disabled on all platforms */
> > > +	if (value == PL1_DISABLE) {
> > > +		process_hwmon_reg(ddat, pkg_rapl_limit, reg_rmw, &nval,
> > > +				  PKG_PWR_LIM_1_EN, 0);
> > > +		process_hwmon_reg(ddat, pkg_rapl_limit, reg_read, &nval,
> > > +				  PKG_PWR_LIM_1_EN, 0);
> > > +
> > > +		if (nval & PKG_PWR_LIM_1_EN)
> > > +			ret = -ENODEV;
> > > +		goto exit;
>
> cough! cough! lock! cough! cough!

Why? It's fine as is.

>
> > > +	}
> > > +
> > > +	/* Computation in 64-bits to avoid overflow. Round to nearest. */
> > > +	nval = DIV_ROUND_CLOSEST_ULL((u64)value << hwmon->scl_shift_power, SF_POWER);
> > > +	nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);
> > > +
> > > +	process_hwmon_reg(ddat, pkg_rapl_limit, reg_rmw, &nval,
> > > +			  PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
> > > +exit:
> > > +	xe_device_mem_access_put(gt_to_xe(ddat->gt));
> > > +unlock:
> > > +	mutex_unlock(&hwmon->hwmon_lock);
> > > +
>
> mmhhh???...
> jokes apart this is so error prone that it will
> deadlock as soon as someone will think of editing this file :)

Why is it error prone and how will it deadlock? In fact this
prepare_to_wait/finish_wait pattern is widely used in the kernel (see
e.g. rpm_suspend) and is one of the few patterns guaranteed to not deadlock
(see also 6.2.5 "Advanced Sleeping" in LDD3 if needed). This is the same
code pattern we also implemented in i915 hwm_power_max_write.

Just to give some history: in i915, a scheme which held a mutex across GuC
reset was proposed first. That was then deemed to be risky, and this more
complex scheme was implemented instead.

Regarding editing the code: it's kernel code involving locking, so it needs
to be edited carefully. It's all confined to a single function (or maybe a
couple of functions), but otherwise yes, definitely not something to mess
around with :)

>
> It worried me already at the first part.
>
> Please, as Matt said, have a more linear locking here.

Afaiac, I don't know of any race-free way to do this other than what was
done in v2 (and in i915). So I want to discard the changes made in v3 and
go back to the changes made in v2. If others can suggest other ways which
they can guarantee are race-free and correct, and can R-b that code, that's
fine. But I can R-b the v2 code (with the only change being to move
xe_device_mem_access_get out of the lock). (Of course I am only talking
about R-b'ing the above scheme; other review comments may be valid.)

Badal, also, if there are questions about this scheme, maybe we should move
this to a separate patch as was done in i915? We can just return -EAGAIN as
in 1b44019a93e2.

Thanks.
--
Ashutosh
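
For readers following the locking argument above, here is a rough userspace sketch of the two shapes being discussed: a writer that sleeps until the reset flag clears, and a writer that simply returns -EAGAIN while a reset is in progress. It is only an analogue, written with pthread mutexes and condition variables rather than the kernel's prepare_to_wait/finish_wait, and every name in it (struct demo_hwmon, demo_power_max_write and so on) is hypothetical, not taken from xe or i915. The -EAGAIN variant illustrates the idea mentioned above and is not claimed to match the contents of 1b44019a93e2.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real xe structures. */
struct demo_hwmon {
	pthread_mutex_t lock;       /* plays the role of hwmon_lock */
	pthread_cond_t reset_done;  /* plays the role of the wait queue */
	bool reset_in_progress;
	long limit;                 /* stand-in for the power limit register */
};

#define DEMO_HWMON_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false, 0 }

/*
 * Blocking writer. pthread_cond_wait() releases the mutex and sleeps as one
 * atomic step, then re-takes the mutex before returning, so a wakeup sent
 * between "check the flag" and "go to sleep" cannot be lost. The lock is
 * never held while sleeping, only while checking the flag and writing.
 */
int demo_power_max_write(struct demo_hwmon *h, long value)
{
	pthread_mutex_lock(&h->lock);
	while (h->reset_in_progress)
		pthread_cond_wait(&h->reset_done, &h->lock);
	h->limit = value;
	pthread_mutex_unlock(&h->lock);
	return 0;
}

/* Non-blocking writer: refuse with -EAGAIN instead of waiting. */
int demo_power_max_write_nowait(struct demo_hwmon *h, long value)
{
	int ret = 0;

	pthread_mutex_lock(&h->lock);
	if (h->reset_in_progress)
		ret = -EAGAIN;
	else
		h->limit = value;
	pthread_mutex_unlock(&h->lock);
	return ret;
}

/* Reset side: set the flag under the lock so writers see a stable value. */
void demo_reset_begin(struct demo_hwmon *h)
{
	pthread_mutex_lock(&h->lock);
	h->reset_in_progress = true;
	pthread_mutex_unlock(&h->lock);
}

/* Reset side: clear the flag and wake every sleeping writer. */
void demo_reset_end(struct demo_hwmon *h)
{
	pthread_mutex_lock(&h->lock);
	assert(h->reset_in_progress);
	h->reset_in_progress = false;
	pthread_cond_broadcast(&h->reset_done);
	pthread_mutex_unlock(&h->lock);
}
```

In a real test the reset functions would run on a second thread while a writer is blocked in demo_power_max_write. The point of the sketch is only that the flag is always checked with the lock held and that the lock is dropped before sleeping, which is the property the open-coded kernel loop is relying on.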