From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>,
Badal Nilawar <badal.nilawar@intel.com>,
<intel-xe@lists.freedesktop.org>, <linux-hwmon@vger.kernel.org>,
<anshuman.gupta@intel.com>, <linux@roeck-us.net>,
<riana.tauro@intel.com>
Subject: Re: [PATCH v2 2/6] drm/xe/hwmon: Expose power attributes
Date: Tue, 15 Aug 2023 16:20:26 -0700
Message-ID: <87zg2rsxj9.wl-ashutosh.dixit@intel.com>
In-Reply-To: <ZJ2Qm0UcAidCEArX@ashyti-mobl2.lan>
On Thu, 29 Jun 2023 07:09:31 -0700, Andi Shyti wrote:
>
Hi Badal/Andi/Matt,
> > > +static int hwm_power_max_write(struct hwm_drvdata *ddat, long value)
> > > +{
> > > + struct xe_hwmon *hwmon = ddat->hwmon;
> > > + DEFINE_WAIT(wait);
> > > + int ret = 0;
> > > + u32 nval;
> > > +
> > > + /* Block waiting for GuC reset to complete when needed */
> > > + for (;;) {
>
> with a do...while() you shouldn't need a for(;;)... your choice,
> not going to beat on that.
>
> > > + mutex_lock(&hwmon->hwmon_lock);
> > > +
> > > + prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> > > +
> > > + if (!hwmon->ddat.reset_in_progress)
> > > + break;
> > > +
> > > + if (signal_pending(current)) {
> > > + ret = -EINTR;
> > > + break;
>
> cough! cough! unlock! cough! cough!
Why? It's fine as is.
>
> > > + }
> > > +
> > > + mutex_unlock(&hwmon->hwmon_lock);
> > > +
> > > + schedule();
> > > + }
> > > + finish_wait(&ddat->waitq, &wait);
> > > + if (ret)
> > > + goto unlock;
> >
> > Anyway to not open code this? We similar in with
> > xe_guc_submit_reset_wait, could we expose a global reset in progress in
> > function which we can expose at the gt level?
I don't know of any way to avoid open-coding this that is guaranteed not to
deadlock (which is not to say that no other way exists).
> >
> > > +
> > > + xe_device_mem_access_get(gt_to_xe(ddat->gt));
> > > +
> >
> > This certainly is an outer most thing, e.g. doing this under
> > hwmon->hwmon_lock seems dangerous. Again the upper levels of the stack
> > should do the xe_device_mem_access_get, which it does. Just pointing out
> > doing xe_device_mem_access_get/put under a lock isn't a good idea.
Agree, this is the only change we should make to this code.
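To illustrate the ordering being agreed on here, a minimal userspace sketch,
assuming a plain counter in place of the real runtime-PM reference and a
pthread mutex in place of hwmon_lock. All names (fake_dev,
fake_mem_access_get/put, fake_write_limit) are hypothetical and not taken from
the xe driver. It only shows the reference being taken before the lock and
dropped after the unlock:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the device; not the real xe structures. */
struct fake_dev {
	pthread_mutex_t lock;		/* models hwmon->hwmon_lock */
	int mem_access_refs;		/* models the mem_access reference count */
	int refs_seen_under_lock;	/* records the count while the lock is held */
	long limit;
};

/* Assumed helpers: the real get/put may do far more (e.g. wake the device). */
void fake_mem_access_get(struct fake_dev *d) { d->mem_access_refs++; }
void fake_mem_access_put(struct fake_dev *d) { d->mem_access_refs--; }

int fake_write_limit(struct fake_dev *d, long value)
{
	assert(d != NULL);

	fake_mem_access_get(d);			/* outermost: taken before the lock */
	pthread_mutex_lock(&d->lock);

	d->refs_seen_under_lock = d->mem_access_refs;
	d->limit = value;			/* stands in for the register write */

	pthread_mutex_unlock(&d->lock);
	fake_mem_access_put(d);			/* dropped only after the unlock */
	return 0;
}
```

With this ordering the get/put calls never run with the lock held, which is
the concern raised above.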
> >
> > Also the the loop which acquires hwmon->hwmon_lock is confusing too.
Confusing but correct.
> >
> > > + /* Disable PL1 limit and verify, as limit cannot be disabled on all platforms */
> > > + if (value == PL1_DISABLE) {
> > > + process_hwmon_reg(ddat, pkg_rapl_limit, reg_rmw, &nval,
> > > + PKG_PWR_LIM_1_EN, 0);
> > > + process_hwmon_reg(ddat, pkg_rapl_limit, reg_read, &nval,
> > > + PKG_PWR_LIM_1_EN, 0);
> > > +
> > > + if (nval & PKG_PWR_LIM_1_EN)
> > > + ret = -ENODEV;
> > > + goto exit;
>
> cough! cough! lock! cough! cough!
Why? It's fine as is.
>
> > > + }
> > > +
> > > + /* Computation in 64-bits to avoid overflow. Round to nearest. */
> > > + nval = DIV_ROUND_CLOSEST_ULL((u64)value << hwmon->scl_shift_power, SF_POWER);
> > > + nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval);
> > > +
> > > + process_hwmon_reg(ddat, pkg_rapl_limit, reg_rmw, &nval,
> > > + PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval);
> > > +exit:
> > > + xe_device_mem_access_put(gt_to_xe(ddat->gt));
> > > +unlock:
> > > + mutex_unlock(&hwmon->hwmon_lock);
> > > +
>
> mmhhh???... jokes apart this is so error prone that it will
> deadlock as soon as someone will think of editing this file :)
Why is it error prone and how will it deadlock? In fact this
prepare_to_wait/finish_wait pattern is widely used in the kernel (see
e.g. rpm_suspend) and is one of the few patterns guaranteed to not deadlock
(see also 6.2.5 "Advanced Sleeping" in LDD3 if needed). This is the same
code pattern we also implemented in i915 hwm_power_max_write.
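To make the shape of that loop concrete, here is a minimal userspace sketch,
assuming POSIX threads rather than the kernel wait-queue API. Every name in it
(fake_hwmon, fake_power_max_write, fake_reset_done, and the interrupted flag
standing in for signal_pending()) is hypothetical and not part of the patch.
pthread_cond_wait() drops the mutex and sleeps in one step, so it is only an
analogy for prepare_to_wait() + mutex_unlock() + schedule(). What it is meant
to show is that every exit from the loop leaves the lock held, so a single
unlock at the end is correct:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the hwmon state; illustrative names only. */
struct fake_hwmon {
	pthread_mutex_t lock;		/* models hwmon->hwmon_lock */
	pthread_cond_t waitq;		/* models ddat->waitq */
	bool reset_in_progress;
	bool interrupted;		/* models signal_pending(current) */
	long power_max;
};

int fake_power_max_write(struct fake_hwmon *h, long value)
{
	int ret = 0;

	assert(h != NULL);

	pthread_mutex_lock(&h->lock);
	while (h->reset_in_progress) {
		if (h->interrupted) {
			ret = -EINTR;
			break;		/* leaves the loop with the lock held */
		}
		/* Drops the lock while sleeping, re-takes it before returning. */
		pthread_cond_wait(&h->waitq, &h->lock);
	}

	if (!ret)
		h->power_max = value;	/* stands in for the register update */

	pthread_mutex_unlock(&h->lock);	/* the only unlock, reached on every path */
	return ret;
}

/* Called when the (modelled) reset finishes: clear the flag, wake waiters. */
void fake_reset_done(struct fake_hwmon *h)
{
	pthread_mutex_lock(&h->lock);
	h->reset_in_progress = false;
	pthread_cond_broadcast(&h->waitq);
	pthread_mutex_unlock(&h->lock);
}
```

The flag is only ever tested with the lock held, so a waker that clears it
under the same lock cannot slip in between the test and the sleep.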
Some history: in i915, a scheme which held a mutex across the GuC reset was
proposed first. That was deemed risky, so this more complex scheme was
implemented instead.
Regarding editing the code: it's kernel code involving locking, which needs
to be edited carefully. It's all confined to a single function (or maybe a
couple of functions), but otherwise yes, definitely not something to mess
around with :)
>
> It worried me already at the first part.
>
> Please, as Matt said, have a more linear locking here.
As far as I am concerned, the only race-free way I know of to do this is what
was done in v2 (and in i915). So I want to discard the changes made in v3
and go back to the changes made in v2. If others can suggest other ways
which they can guarantee are race-free and correct, and can R-b that
code, that's fine.
But I can R-b the v2 code (with the only change of moving
xe_device_mem_access_get out of the lock). (Of course I am only talking
about R-b'ing the above scheme, other review comments may be valid).
Badal, also, if there are questions about this scheme, maybe we should move
this to a separate patch as was done in i915? We can just return -EAGAIN as
in 1b44019a93e2.
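For comparison, a minimal sketch of what the simpler "-EAGAIN while a reset is
in progress" behaviour could look like, again in userspace C with a pthread
mutex. The names (fake_state, fake_power_max_write_nowait) are hypothetical,
and this is not a reproduction of the referenced commit, only of the idea of
failing fast instead of sleeping:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the hwmon state; illustrative names only. */
struct fake_state {
	pthread_mutex_t lock;
	bool reset_in_progress;
	long power_max;
};

int fake_power_max_write_nowait(struct fake_state *s, long value)
{
	int ret = 0;

	assert(s != NULL);

	pthread_mutex_lock(&s->lock);
	if (s->reset_in_progress)
		ret = -EAGAIN;		/* caller is expected to retry later */
	else
		s->power_max = value;	/* stands in for the register update */
	pthread_mutex_unlock(&s->lock);

	return ret;
}
```

There is no wait loop here at all, so the locking is linear; the cost is that
the retry is pushed out to the caller.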
Thanks.
--
Ashutosh