From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <intel-xe-bounces@lists.freedesktop.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.lore.kernel.org (Postfix) with ESMTPS id 4142CC0015E
	for <intel-xe@archiver.kernel.org>; Tue, 15 Aug 2023 23:20:33 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id CAD6A10E2A4;
	Tue, 15 Aug 2023 23:20:32 +0000 (UTC)
Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126])
 by gabe.freedesktop.org (Postfix) with ESMTPS id 5F60B10E2A4
 for <intel-xe@lists.freedesktop.org>; Tue, 15 Aug 2023 23:20:30 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple;
 d=intel.com; i=@intel.com; q=dns/txt; s=Intel;
 t=1692141630; x=1723677630;
 h=date:message-id:from:to:cc:subject:in-reply-to:
 references:mime-version;
 bh=0UeDMCBIhYbwtHtWPEGSdXTnhki89eBrNah5gHQFmRk=;
 b=AexmEK6ez5a5AazHAsd4x8N9PpUOgpOvrpRIns8pzIRsAuFKfzZ1FTrk
 IyoT6ZOFkOXWR107UopfDfv5pWQrhlxAW9LZsP1knJUR75uw/GqdbPD1L
 KcA7/Om1qVzGCOIgS6n0a6xtbz2YYA+3sFSx6JB1sPTYSa1E3DKgxh423
 F1pNciZr2NCkVmIrMf59Dd3IePd6uDhWvnei4A4D88QhoBs8ogblfjEJr
 Nc1F4LxKTyJTg/n+aQHX+m1pzZ2QZzBbSR8t2RZZL7zN9hlA9nXxRtw8w
 SqxI+d3GcyLWvrN0KVTFZbagOgvoaSwTDORhAg7D8NFdRP37eRncAkztH Q==;
X-IronPort-AV: E=McAfee;i="6600,9927,10803"; a="357376592"
X-IronPort-AV: E=Sophos;i="6.01,175,1684825200"; d="scan'208";a="357376592"
Received: from orsmga004.jf.intel.com ([10.7.209.38])
 by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 15 Aug 2023 16:20:29 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=McAfee;i="6600,9927,10803"; a="857600690"
X-IronPort-AV: E=Sophos;i="6.01,175,1684825200"; d="scan'208";a="857600690"
Received: from adixit-mobl.amr.corp.intel.com (HELO adixit-arch.intel.com)
 ([10.209.138.252])
 by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 15 Aug 2023 16:20:28 -0700
Date: Tue, 15 Aug 2023 16:20:26 -0700
Message-ID: <87zg2rsxj9.wl-ashutosh.dixit@intel.com>
From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Andi Shyti <andi.shyti@linux.intel.com>
In-Reply-To: <ZJ2Qm0UcAidCEArX@ashyti-mobl2.lan>
References: <20230627183043.2024530-1-badal.nilawar@intel.com>	<20230627183043.2024530-3-badal.nilawar@intel.com>	<ZJzNuq/WaxjZ8YH/@DUT025-TGLU.fm.intel.com>	<ZJ2Qm0UcAidCEArX@ashyti-mobl2.lan>
User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue)
 FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0
 Emacs/29.1 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset=US-ASCII
Subject: Re: [Intel-xe] [PATCH v2 2/6] drm/xe/hwmon: Expose power attributes
X-BeenThere: intel-xe@lists.freedesktop.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Intel Xe graphics driver <intel-xe.lists.freedesktop.org>
List-Unsubscribe: <https://lists.freedesktop.org/mailman/options/intel-xe>,
 <mailto:intel-xe-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <https://lists.freedesktop.org/archives/intel-xe>
List-Post: <mailto:intel-xe@lists.freedesktop.org>
List-Help: <mailto:intel-xe-request@lists.freedesktop.org?subject=help>
List-Subscribe: <https://lists.freedesktop.org/mailman/listinfo/intel-xe>,
 <mailto:intel-xe-request@lists.freedesktop.org?subject=subscribe>
Cc: linux-hwmon@vger.kernel.org, intel-xe@lists.freedesktop.org,
 linux@roeck-us.net
Errors-To: intel-xe-bounces@lists.freedesktop.org
Sender: "Intel-xe" <intel-xe-bounces@lists.freedesktop.org>

On Thu, 29 Jun 2023 07:09:31 -0700, Andi Shyti wrote:
>

Hi Badal/Andi/Matt,

> > > +static int hwm_power_max_write(struct hwm_drvdata *ddat, long value)
> > > +{
> > > +	struct xe_hwmon *hwmon = ddat->hwmon;
> > > +	DEFINE_WAIT(wait);
> > > +	int ret = 0;
> > > +	u32 nval;
> > > +
> > > +	/* Block waiting for GuC reset to complete when needed */
> > > +	for (;;) {
>
> with a do...while() you shouldn't need a for(;;)... your choice,
> not going to beat on that.
>
> > > +		mutex_lock(&hwmon->hwmon_lock);
> > > +
> > > +		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> > > +
> > > +		if (!hwmon->ddat.reset_in_progress)
> > > +			break;
> > > +
> > > +		if (signal_pending(current)) {
> > > +			ret = -EINTR;
> > > +			break;
>
> cough! cough! unlock! cough! cough!

Why? It's fine as is.

>
> > > +		}
> > > +
> > > +		mutex_unlock(&hwmon->hwmon_lock);
> > > +
> > > +		schedule();
> > > +	}
> > > +	finish_wait(&ddat->waitq, &wait);
> > > +	if (ret)
> > > +		goto unlock;
> >
> > Any way to not open code this? We have something similar in
> > xe_guc_submit_reset_wait; could we expose a global reset-in-progress
> > function at the gt level?

I don't know of any way to avoid open coding this that is guaranteed not to
deadlock (which is not to say that no other way exists).

> >
> > > +
> > > +	xe_device_mem_access_get(gt_to_xe(ddat->gt));
> > > +
> >
> > This certainly is an outer most thing, e.g. doing this under
> > hwmon->hwmon_lock seems dangerous. Again the upper levels of the stack
> > should do the xe_device_mem_access_get, which it does. Just pointing out
> > doing xe_device_mem_access_get/put under a lock isn't a good idea.

Agree, this is the only change we should make to this code.
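
To make the intended ordering concrete, here is a small userspace sketch
(POSIX threads, not driver code). Everything in it is hypothetical: struct
fake_dev and the fake_* helpers only stand in for the device, for
xe_device_mem_access_get/put and for hwmon_lock. It just shows the reference
being taken before the lock and dropped after the unlock:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device; not the xe structures. */
struct fake_dev {
	pthread_mutex_t lock;
	bool lock_held;		/* mirrors the mutex state for the check below */
	int mem_access_ref;	/* stand-in for the mem access reference count */
	bool get_under_lock;	/* set if a reference was ever taken under the lock */
	long limit;
};

static void fake_mem_access_get(struct fake_dev *d)
{
	if (d->lock_held)
		d->get_under_lock = true;
	d->mem_access_ref++;
}

static void fake_mem_access_put(struct fake_dev *d)
{
	assert(d->mem_access_ref > 0);
	d->mem_access_ref--;
}

static void fake_limit_write(struct fake_dev *d, long value)
{
	fake_mem_access_get(d);		/* outermost: taken before the lock */

	pthread_mutex_lock(&d->lock);
	d->lock_held = true;
	d->limit = value;		/* the "register write" */
	d->lock_held = false;
	pthread_mutex_unlock(&d->lock);

	fake_mem_access_put(d);		/* dropped only after the unlock */
}
```

In the real function the error paths would need the same care: any exit taken
after the get must also do the put.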

> >
> > Also the loop which acquires hwmon->hwmon_lock is confusing too.

Confusing but correct.

> >
> > > +	/* Disable PL1 limit and verify, as limit cannot be disabled on all platforms */
> > > +	if (value == PL1_DISABLE) {
> > > +		process_hwmon_reg(ddat, pkg_rapl_limit, reg_rmw, &nval,
> > > +				  PKG_PWR_LIM_1_EN, 0);
> > > +		process_hwmon_reg(ddat, pkg_rapl_limit, reg_read, &nval,
> > > +				  PKG_PWR_LIM_1_EN, 0);
> > > +
> > > +		if (nval & PKG_PWR_LIM_1_EN)
> > > +			ret = -ENODEV;
> > > +		goto exit;
>
> cough! cough! lock! cough! cough!

Why? It's fine as is.

>
> > > +	}
> > > +
> > > +	/* Computation in 64-bits to avoid overflow. Round to nearest. */
> > > +	nval = DIV_ROUND_CLOSEST_ULL((u64)value << hwmon->scl_shift_power, SF_POWER);
> > > +	nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);
> > > +
> > > +	process_hwmon_reg(ddat, pkg_rapl_limit, reg_rmw, &nval,
> > > +			  PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
> > > +exit:
> > > +	xe_device_mem_access_put(gt_to_xe(ddat->gt));
> > > +unlock:
> > > +	mutex_unlock(&hwmon->hwmon_lock);
> > > +
>
> mmhhh???... jokes apart this is so error prone that it will
> deadlock as soon as someone will think of editing this file :)

Why is it error prone and how will it deadlock? In fact this
prepare_to_wait/finish_wait pattern is widely used in the kernel (see
e.g. rpm_suspend) and is one of the few patterns guaranteed to not deadlock
(see also 6.2.5 "Advanced Sleeping" in LDD3 if needed). This is the same
code pattern we also implemented in i915 hwm_power_max_write.
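
To illustrate why both exits from the wait loop are safe, here is a minimal
userspace sketch of the same idea using POSIX threads instead of kernel wait
queues. It is an analogue, not the driver code: struct hwmon_state,
power_max_write(), reset_done() and the interrupted flag (standing in for
signal_pending()) are all hypothetical names. What it shows is that the
condition is only ever tested with the lock held, and that every path out of
the function releases the lock exactly once:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state; names are illustrative only. */
struct hwmon_state {
	pthread_mutex_t lock;
	pthread_cond_t waitq;
	bool reset_in_progress;
	bool interrupted;	/* stand-in for signal_pending(current) */
	long power_max;
};

#define HWMON_STATE_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false, false, 0 }

static int power_max_write(struct hwmon_state *s, long value)
{
	int ret = 0;

	assert(s != NULL);
	pthread_mutex_lock(&s->lock);

	/* Block until the reset completes; the flag is tested under the lock. */
	while (s->reset_in_progress) {
		if (s->interrupted) {
			ret = -EINTR;
			goto unlock;	/* lock is still held here */
		}
		/* Atomically drops the lock, sleeps, and retakes it on wakeup. */
		pthread_cond_wait(&s->waitq, &s->lock);
	}

	s->power_max = value;	/* the "register write", done under the lock */
unlock:
	pthread_mutex_unlock(&s->lock);
	return ret;
}

/* Called by the reset path when the reset has finished. */
static void reset_done(struct hwmon_state *s)
{
	pthread_mutex_lock(&s->lock);
	s->reset_in_progress = false;
	pthread_cond_broadcast(&s->waitq);
	pthread_mutex_unlock(&s->lock);
}
```

pthread_cond_wait() plays the role of the prepare_to_wait/unlock/schedule
sequence here: the waiter is queued before the lock is dropped, so a wakeup
between the check and the sleep cannot be lost.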

For some history: in i915, a scheme which held a mutex across GuC reset was
proposed first. That was deemed to be risky, so this more complex scheme
was implemented instead.

Regarding editing the code: it's kernel code involving locking, so it needs
to be edited carefully. It's all confined to a single function (or maybe a
couple of functions), but otherwise yes, definitely not to be messed with :)

>
> It worried me already at the first part.
>
> Please, as Matt said, have a more linear locking here.

I don't know of any race-free way to do this other than what was done in v2
(and in i915). So I want to discard the changes made in v3 and go back to
the changes made in v2. If others can suggest other ways which they can
guarantee are race-free and correct, and can R-b that code, that's fine.

But I can R-b the v2 code (with the only change being to move
xe_device_mem_access_get out of the lock). (Of course I am only talking
about R-b'ing the above scheme; other review comments may still be valid.)

Badal, also, if there are questions about this scheme, maybe we should move
this to a separate patch as was done in i915? We can just return -EAGAIN as
in 1b44019a93e2.
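
For reference, this is roughly what "just return -EAGAIN" could look like,
again as a userspace sketch with POSIX threads. It is not the content of that
commit; struct pl1_state and pl1_write_nowait() are hypothetical names. The
writer never sleeps: it refuses the write while a reset is in flight and
leaves the retry to the caller:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state; not the xe or i915 structs. */
struct pl1_state {
	pthread_mutex_t lock;
	bool reset_in_progress;
	long power_max;
};

/* Non-blocking variant: refuse the write while a reset is in flight. */
static int pl1_write_nowait(struct pl1_state *s, long value)
{
	int ret = 0;

	assert(s != NULL);
	pthread_mutex_lock(&s->lock);
	if (s->reset_in_progress)
		ret = -EAGAIN;		/* caller is expected to retry later */
	else
		s->power_max = value;	/* the "register write" */
	pthread_mutex_unlock(&s->lock);

	return ret;
}
```

This is simpler to review since there is no wait loop, at the cost of pushing
the retry onto userspace.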

Thanks.
--
Ashutosh