From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: "Poosa, Karthik" <karthik.poosa@intel.com>
Cc: Raag Jadav <raag.jadav@intel.com>,
<intel-xe@lists.freedesktop.org>, <anshuman.gupta@intel.com>,
<badal.nilawar@intel.com>
Subject: Re: [PATCH v5 4/4] drm/xe/hwmon: Expose individual vram channel temperature
Date: Mon, 12 Jan 2026 12:23:17 -0500 [thread overview]
Message-ID: <aWUuBYoZt9xmAlcf@intel.com> (raw)
In-Reply-To: <67e9d831-361b-4691-8a52-bc6d0695b7b5@intel.com>
On Mon, Jan 12, 2026 at 05:15:35PM +0530, Poosa, Karthik wrote:
>
> On 12-01-2026 13:41, Raag Jadav wrote:
> > On Sun, Jan 11, 2026 at 12:52:39AM +0530, Poosa, Karthik wrote:
> > > On 10-01-2026 21:53, Raag Jadav wrote:
> > > > On Sat, Jan 10, 2026 at 01:46:44AM +0530, Karthik Poosa wrote:
> > ...
> >
> > > > > +static inline bool is_vram_ch_available(struct xe_hwmon *hwmon, int channel)
> > > > > +{
> > > > > + struct xe_reg vram_ch_temp;
> > > > > + struct xe_mmio *mmio = xe_root_tile_mmio(hwmon->xe);
> > > > > +
> > > > > + vram_ch_temp = xe_hwmon_get_reg(hwmon, REG_TEMP, channel);
> > > > > + if (xe_reg_is_valid(vram_ch_temp) && xe_mmio_read32(mmio, vram_ch_temp)) {
> > > > > + /* Create label only for available vram channel */
> > > > > + sprintf(hwmon->temp.vram_label[channel - CHANNEL_VRAM_N], "vram_ch_%d",
> > > > > + (channel - CHANNEL_VRAM_N));
> > > > > + return 1;
> > > > > + }
> > > > > + return 0;
> > > > > +}
> > > > I'd write this as
> > > >
> > > > static inline bool is_vram_ch_available(struct xe_hwmon *hwmon, int channel)
> > > > {
> > > > struct xe_mmio *mmio = xe_root_tile_mmio(hwmon->xe);
> > > > int vram_id = channel - CHANNEL_VRAM_N;
> > > > struct xe_reg vram_reg;
> > > >
> > > > vram_reg = xe_hwmon_get_reg(hwmon, REG_TEMP, channel);i!
> > > > if (!xe_reg_is_valid(vram_reg) || !xe_mmio_read32(mmio, vram_reg))
> > > > return false;
> > > >
> > > > /* Create label only for available vram channel */
> > > > sprintf(hwmon->temp.vram_label[vram_id], "vram_ch_%d", vram_id);
> > > > return true;
> > > > }
> > > I'll agree with vram_id and boolean values, for readability,
> > > other than that I would like to stick to current implementation.
> > The usual practice is early return negative cases, but upto you.
> okay we can do that in next revision
> >
> > Also, just curious: Do we need the 'ch' string here? We already know
> > it's channel, right?
> yes to differentiate between existing channel "vram", I've appended with
> _ch_X
> > > > > static umode_t
> > > > > xe_hwmon_temp_is_visible(struct xe_hwmon *hwmon, u32 attr, int channel)
> > > > > {
> > > > > @@ -903,6 +944,8 @@ xe_hwmon_temp_is_visible(struct xe_hwmon *hwmon, u32 attr, int channel)
> > > > > case CHANNEL_MCTRL:
> > > > > case CHANNEL_PCIE:
> > > > > return hwmon->temp.count ? 0444 : 0;
> > > > > + case CHANNEL_VRAM_N...CHANNEL_VRAM_N_MAX:
> > > > > + return is_vram_ch_available(hwmon, channel) ? 0444 : 0;
> > > > Shouldn't we also check hwmon->temp.limit[TEMP_LIMIT_MEM_SHUTDOWN]?
> > > that can be secondary check, then this would apply to all channels !
> > For the channels that return data from the mailbox, we'd want to make sure
> > the data source is also working. Else we'll have dummy attributes exposing
> > no useful data.
> okay, I shall add this in next revision
> >
> > > > > default:
> > > > > return 0;
> > > > > }
> > > > > @@ -915,6 +958,8 @@ xe_hwmon_temp_is_visible(struct xe_hwmon *hwmon, u32 attr, int channel)
> > > > > case CHANNEL_MCTRL:
> > > > > case CHANNEL_PCIE:
> > > > > return hwmon->temp.count ? 0444 : 0;
> > > > > + case CHANNEL_VRAM_N...CHANNEL_VRAM_N_MAX:
> > > > > + return is_vram_ch_available(hwmon, channel) ? 0444 : 0;
> > > > Ditto, hwmon->temp.limit[TEMP_LIMIT_MEM_CRIT]?
> > > >
> > > > > default:
> > > > > return 0;
> > > > > }
> > > > > @@ -935,6 +980,8 @@ xe_hwmon_temp_is_visible(struct xe_hwmon *hwmon, u32 attr, int channel)
> > > > > case CHANNEL_MCTRL:
> > > > > case CHANNEL_PCIE:
> > > > > return hwmon->temp.count ? 0444 : 0;
> > > > > + case CHANNEL_VRAM_N...CHANNEL_VRAM_N_MAX:
> > > > > + return is_vram_ch_available(hwmon, channel) ? 0444 : 0;
> > > > > default:
> > > > > return 0;
> > > > > }
> > > > > @@ -963,6 +1010,16 @@ xe_hwmon_temp_read(struct xe_hwmon *hwmon, u32 attr, int channel, long *val)
> > > > > return get_mc_temp(hwmon, val);
> > > > > case CHANNEL_PCIE:
> > > > > return get_pcie_temp(hwmon, val);
> > > > > + case CHANNEL_VRAM_N...CHANNEL_VRAM_N_MAX:
> > > > > + reg_val = xe_mmio_read32(mmio, xe_hwmon_get_reg(hwmon, REG_TEMP, channel));
> > > > > + /*
> > > > > + * This temperature format is bit 31 for sign, bits [30:8] for whole number
> > > > > + * and bits [7:0] for fraction
> > > > Nit: "Temperature format is 24 bits [31:8] signed integer and
> > > > 8 bits [7:0] fraction."
> > > >
> > > > > + */
> > > > > + *val = (s32)(REG_FIELD_GET(TEMP_MASK_VRAM_N, reg_val)) */
> > > > > + (REG_FIELD_GET(TEMP_SIGN_MASK, reg_val) ? -1 : 1) *
> > > > Since you're already casting it, I'm wondering if you need to check
> > > > for sign?
> > > |REG_FIELD_GET() returns unsigned type, which gets stored |the lower 24 bits
> > > of an |s32|, discarding the sign bit; consequently, negative values are
> > > interpreted as positive, requiring an explicit sign check.
> > Would something like this work?
> >
> > s32 vram_n = (reg_val & TEMP_SIGN_MASK) | REG_FIELD_GET(TEMP_MASK_VRAM_N, reg_val);
> this is okay
> >
> > *val = vram_n * MILLIDEGREE_PER_DEGREE;
>
> but this wont work as it will treat value as unsigned and we'll need to
> again check the sign check here.
>
> existing one does these all in a single statement.
>
> > return 0;
> >
> > > > > + MILLIDEGREE_PER_DEGREE;
> > > > > + return 0;
> > > > > default:
> > > > > return -EOPNOTSUPP;
> > > > > }
> > > > > @@ -974,6 +1031,7 @@ xe_hwmon_temp_read(struct xe_hwmon *hwmon, u32 attr, int channel, long *val)
> > > > > *val = hwmon->temp.limit[TEMP_LIMIT_PKG_SHUTDOWN] * MILLIDEGREE_PER_DEGREE;
> > > > > return 0;
> > > > > case CHANNEL_VRAM:
> > > > > + case CHANNEL_VRAM_N...CHANNEL_VRAM_N_MAX:
> > > > > *val = hwmon->temp.limit[TEMP_LIMIT_MEM_SHUTDOWN] * MILLIDEGREE_PER_DEGREE;
> > > > > return 0;
> > > > > default:
> > > > > @@ -987,6 +1045,7 @@ xe_hwmon_temp_read(struct xe_hwmon *hwmon, u32 attr, int channel, long *val)
> > > > > *val = hwmon->temp.limit[TEMP_LIMIT_PKG_CRIT] * MILLIDEGREE_PER_DEGREE;
> > > > > return 0;
> > > > > case CHANNEL_VRAM:
> > > > > + case CHANNEL_VRAM_N...CHANNEL_VRAM_N_MAX:
> > > > > *val = hwmon->temp.limit[TEMP_LIMIT_MEM_CRIT] * MILLIDEGREE_PER_DEGREE;
> > > > > return 0;
> > > > > default:
> > > > > @@ -1356,16 +1415,20 @@ static int xe_hwmon_read_label(struct device *dev,
> > > > > enum hwmon_sensor_types type,
> > > > > u32 attr, int channel, const char **str)
> > > > > {
> > > > > + struct xe_hwmon *hwmon = dev_get_drvdata(dev);
> > > > > +
> > > > > switch (type) {
> > > > > case hwmon_temp:
> > > > > if (channel == CHANNEL_PKG)
> > > > > *str = "pkg";
> > > > > else if (channel == CHANNEL_VRAM)
> > > > > - *str = "vram";
> > > > > + *str = "vram_avg";
> > > > If you look at the readings this is actually not average, so it's a bit
> > > > misleading.
> > > what is your suggestion for that label here ?
> > Since this is a stable ABI, let's first make sure that we can actually
> > change output string. If we can, then something like vram_high or
> > vram_peak would be more appropriate.
> >
> > Note: This is different from _max attribute which signifies the limit.
> >
> > Raag
>
> 1. Actually, this label change can be a separate patch.
>
> 2. We are modifying label here, ABI still remains same, so this should be
> okay
>
> vram_peak seems apt, as it the current max value of all vrams
>
> Rodrigo, can you also share your comments on this ?
I'd say we can keep it as 'vram' only... like the last version of this
patch is doing. I see no reason to modify it.
And if some tool out there is already relying on these strings, it
might break. So, better to keep and avoid changing it.
>
> >
> > > > > else if (channel == CHANNEL_MCTRL)
> > > > > *str = "mctrl";
> > > > > else if (channel == CHANNEL_PCIE)
> > > > > *str = "pcie";
> > > > > + else if (channel >= CHANNEL_VRAM_N && channel <= CHANNEL_VRAM_N_MAX)
> > > > > + *str = hwmon->temp.vram_label[channel - CHANNEL_VRAM_N];
> > > > > return 0;
> > > > > case hwmon_power:
> > > > > case hwmon_energy:
> > > > > --
> > > > > 2.25.1
> > > > >
next prev parent reply other threads:[~2026-01-12 17:23 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-09 20:16 [PATCH v5 0/4] drm/xe/hwmon: Expose new temperature attributes Karthik Poosa
2026-01-09 20:16 ` [PATCH v5 1/4] drm/xe/hwmon: Expose temperature limits Karthik Poosa
2026-01-10 10:09 ` Raag Jadav
2026-01-12 6:50 ` Poosa, Karthik
2026-01-09 20:16 ` [PATCH v5 2/4] drm/xe/hwmon: Expose memory controller temperature Karthik Poosa
2026-01-10 10:42 ` Raag Jadav
2026-01-12 6:56 ` Poosa, Karthik
2026-01-09 20:16 ` [PATCH v5 3/4] drm/xe/hwmon: Expose GPU pcie temperature Karthik Poosa
2026-01-10 11:13 ` Raag Jadav
2026-01-12 7:05 ` Poosa, Karthik
2026-01-09 20:16 ` [PATCH v5 4/4] drm/xe/hwmon: Expose individual vram channel temperature Karthik Poosa
2026-01-10 16:23 ` Raag Jadav
2026-01-10 19:22 ` Poosa, Karthik
2026-01-12 8:11 ` Raag Jadav
2026-01-12 11:45 ` Poosa, Karthik
2026-01-12 17:23 ` Rodrigo Vivi [this message]
2026-01-09 20:17 ` ✓ CI.KUnit: success for drm/xe/hwmon: Expose new temperature attributes (rev7) Patchwork
2026-01-09 21:25 ` ✓ Xe.CI.BAT: " Patchwork
2026-01-10 2:06 ` ✓ Xe.CI.Full: " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aWUuBYoZt9xmAlcf@intel.com \
--to=rodrigo.vivi@intel.com \
--cc=anshuman.gupta@intel.com \
--cc=badal.nilawar@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=karthik.poosa@intel.com \
--cc=raag.jadav@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.