Intel-XE Archive on lore.kernel.org
From: "Nilawar, Badal" <badal.nilawar@intel.com>
To: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>,
	Andi Shyti <andi.shyti@linux.intel.com>
Cc: linux-hwmon@vger.kernel.org, linux@roeck-us.net,
	intel-xe@lists.freedesktop.org
Subject: Re: [Intel-xe] [PATCH v2 2/6] drm/xe/hwmon: Expose power attributes
Date: Fri, 18 Aug 2023 09:33:19 +0530
Message-ID: <37e66486-e1e6-318d-f402-b83231a7aacd@intel.com>
In-Reply-To: <87zg2rsxj9.wl-ashutosh.dixit@intel.com>



On 16-08-2023 04:50, Dixit, Ashutosh wrote:
> On Thu, 29 Jun 2023 07:09:31 -0700, Andi Shyti wrote:
>>
> 
> Hi Badal/Andi/Matt,
> 
>>>> +static int hwm_power_max_write(struct hwm_drvdata *ddat, long value)
>>>> +{
>>>> +	struct xe_hwmon *hwmon = ddat->hwmon;
>>>> +	DEFINE_WAIT(wait);
>>>> +	int ret = 0;
>>>> +	u32 nval;
>>>> +
>>>> +	/* Block waiting for GuC reset to complete when needed */
>>>> +	for (;;) {
>>
>> with a do...while() you shouldn't need a for(;;)... your choice,
>> not going to beat on that.
>>
>>>> +		mutex_lock(&hwmon->hwmon_lock);
>>>> +
>>>> +		prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
>>>> +
>>>> +		if (!hwmon->ddat.reset_in_progress)
>>>> +			break;
>>>> +
>>>> +		if (signal_pending(current)) {
>>>> +			ret = -EINTR;
>>>> +			break;
>>
>> cough! cough! unlock! cough! cough!
> 
> Why? It's fine as is.
> 
>>
>>>> +		}
>>>> +
>>>> +		mutex_unlock(&hwmon->hwmon_lock);
>>>> +
>>>> +		schedule();
>>>> +	}
>>>> +	finish_wait(&ddat->waitq, &wait);
>>>> +	if (ret)
>>>> +		goto unlock;
>>>
>>> Any way to not open code this? We have something similar in
>>> xe_guc_submit_reset_wait; could we expose a global reset-in-progress
>>> function at the gt level?
> 
> I don't know of any way to not open code this which is guaranteed to not
> deadlock (not to say there are no other ways).
> 
>>>
>>>> +
>>>> +	xe_device_mem_access_get(gt_to_xe(ddat->gt));
>>>> +
>>>
>>> This certainly is an outermost thing, e.g. doing this under
>>> hwmon->hwmon_lock seems dangerous. Again the upper levels of the stack
>>> should do the xe_device_mem_access_get, which it does. Just pointing out
>>> doing xe_device_mem_access_get/put under a lock isn't a good idea.
> 
> Agree, this is the only change we should make to this code.
> 
>>>
>>> Also the loop which acquires hwmon->hwmon_lock is confusing too.
> 
> Confusing but correct.
> 
>>>
>>>> +	/* Disable PL1 limit and verify, as limit cannot be disabled on all platforms */
>>>> +	if (value == PL1_DISABLE) {
>>>> +		process_hwmon_reg(ddat, pkg_rapl_limit, reg_rmw, &nval,
>>>> +				  PKG_PWR_LIM_1_EN, 0);
>>>> +		process_hwmon_reg(ddat, pkg_rapl_limit, reg_read, &nval,
>>>> +				  PKG_PWR_LIM_1_EN, 0);
>>>> +
>>>> +		if (nval & PKG_PWR_LIM_1_EN)
>>>> +			ret = -ENODEV;
>>>> +		goto exit;
>>
>> cough! cough! lock! cough! cough!
> 
> Why? It's fine as is.
> 
>>
>>>> +	}
>>>> +
>>>> +	/* Computation in 64-bits to avoid overflow. Round to nearest. */
>>>> +	nval = DIV_ROUND_CLOSEST_ULL((u64)value << hwmon->scl_shift_power, SF_POWER);
>>>> +	nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);
>>>> +
>>>> +	process_hwmon_reg(ddat, pkg_rapl_limit, reg_rmw, &nval,
>>>> +			  PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
>>>> +exit:
>>>> +	xe_device_mem_access_put(gt_to_xe(ddat->gt));
>>>> +unlock:
>>>> +	mutex_unlock(&hwmon->hwmon_lock);
>>>> +
>>
>> mmhhh???... jokes apart this is so error prone that it will
>> deadlock as soon as someone will think of editing this file :)
> 
> Why is it error prone and how will it deadlock? In fact this
> prepare_to_wait/finish_wait pattern is widely used in the kernel (see
> e.g. rpm_suspend) and is one of the few patterns guaranteed to not deadlock
> (see also 6.2.5 "Advanced Sleeping" in LDD3 if needed). This is the same
> code pattern we also implemented in i915 hwm_power_max_write.
> 
> In i915 first a scheme which held a mutex across GuC reset was
> proposed. But that was then deemed to be risky and this complex scheme was
> then implemented. Just to give some history.
> 
> Regarding editing the code, it's kernel code involving locking which needs
> to be edited carefully, it's all confined to a single function (or maybe a
> couple of functions), but otherwise yes definitely not to mess around with :)
> 
>>
>> It worried me already at the first part.
>>
>> Please, as Matt said, have a more linear locking here.
> 
> Afaiac I don't know of any other race-free way to do this other than what
> was done in v2 (and in i915). So I want to discard the changes made in v3
> and go back to the changes made in v2. If others can suggest other ways
> which they can guarantee are race-free and correct and can R-b that
> code, that's fine.
> 
> But I can R-b the v2 code (with the only change of moving
> xe_device_mem_access_get out of the lock). (Of course I am only talking
> about R-b'ing the above scheme, other review comments may be valid).
> 
> Badal, also, if there are questions about this scheme, maybe we should move
> this to a separate patch as was done in i915? We can just return -EAGAIN as
> in 1b44019a93e2.

Thanks Ashutosh. I think for now I will drop the changes for "PL1 disable
during GuC load" from this series and will handle them in a separate patch.

Regards,
Badal
> Thanks.
> --
> Ashutosh
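
The disagreement above about the missing unlock comes down to one invariant:
both break paths leave the wait loop with hwmon_lock held, so the single
mutex_unlock at the unlock label balances them. The sketch below is a
userspace analogue of that shape, not the driver code. It assumes a pthread
mutex and condition variable in place of hwmon_lock and the wait queue, and a
plain "interrupted" flag in place of signal_pending(current). The struct, the
initializer macro, and hwm_reset_done are hypothetical names introduced only
for illustration.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the driver state; not the xe structs. */
struct hwm_state {
	pthread_mutex_t lock;   /* plays the role of hwmon->hwmon_lock */
	pthread_cond_t waitq;   /* plays the role of ddat->waitq */
	bool reset_in_progress;
	bool interrupted;       /* stand-in for signal_pending(current) */
	long power_max;
};

#define HWM_STATE_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false, false, 0 }

/*
 * Wait until no reset is in progress, then store the new limit.
 * Invariant: every exit from the loop leaves the lock held, so the single
 * unlock at the end balances both the success path and the -EINTR path.
 */
static int hwm_power_max_write(struct hwm_state *s, long value)
{
	int ret = 0;

	pthread_mutex_lock(&s->lock);
	while (s->reset_in_progress) {
		/* Checked before each sleep, with the lock held. */
		if (s->interrupted) {
			ret = -EINTR;
			break;  /* lock is still held here */
		}
		/* Atomically drops the lock, sleeps, and retakes the lock. */
		pthread_cond_wait(&s->waitq, &s->lock);
	}
	if (!ret)
		s->power_max = value;
	pthread_mutex_unlock(&s->lock);
	return ret;
}

/* Called when the reset finishes: clear the flag and wake all waiters. */
static void hwm_reset_done(struct hwm_state *s)
{
	pthread_mutex_lock(&s->lock);
	s->reset_in_progress = false;
	pthread_cond_broadcast(&s->waitq);
	pthread_mutex_unlock(&s->lock);
}
```

pthread_cond_wait releases the mutex and sleeps in one atomic step, which is
what the kernel version achieves by calling prepare_to_wait before checking
the condition and only then dropping the lock and calling schedule(). In both
forms a wakeup that arrives between the check and the sleep is not lost.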

