From: Raag Jadav <raag.jadav@intel.com>
To: Karthik Poosa <karthik.poosa@intel.com>
Cc: intel-xe@lists.freedesktop.org, anshuman.gupta@intel.com,
badal.nilawar@intel.com, rodrigo.vivi@intel.com
Subject: Re: [PATCH v4 3/4] drm/xe/hwmon: Expose GPU pcie temperature
Date: Fri, 9 Jan 2026 14:26:45 +0100
Message-ID: <aWECFeIZ_64ZHBaV@black.igk.intel.com>
In-Reply-To: <20260108130323.426531-4-karthik.poosa@intel.com>
On Thu, Jan 08, 2026 at 06:33:22PM +0530, Karthik Poosa wrote:
> Expose GPU PCIe average temperature and its limits via hwmon
> sysfs temp5_xxx.
Same comments as on the previous patch. Also, use "PCIe" (not "pcie") in the subject.
> Update Xe hwmon sysfs documentation for this.
>
> v2: Update kernel version in Xe hwmon documentation. (Raag)
>
> v3:
> - Address review comments from Raag.
> - Remove redundant debug log.
> - Update kernel version in Xe hwmon documentation. (Raag)
>
> Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
> ---
> .../ABI/testing/sysfs-driver-intel-xe-hwmon | 24 ++++++++++++++
> drivers/gpu/drm/xe/xe_hwmon.c | 32 +++++++++++++++++++
> drivers/gpu/drm/xe/xe_pcode_api.h | 4 ++-
> 3 files changed, 59 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
> index a9fcfa6f11b9..6041805a5efc 100644
> --- a/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
> +++ b/Documentation/ABI/testing/sysfs-driver-intel-xe-hwmon
> @@ -260,3 +260,27 @@ Contact: intel-xe@lists.freedesktop.org
> Description: RO. Memory controller critical temperature in millidegree Celsius.
>
> Only supported for particular Intel Xe graphics platforms.
> +
> +What: /sys/bus/pci/drivers/xe/.../hwmon/hwmon<i>/temp5_input
> +Date: January 2026
> +KernelVersion: 7.0
> +Contact: intel-xe@lists.freedesktop.org
> +Description: RO. GPU PCIe temperature in millidegree Celsius.
> +
> + Only supported for particular Intel Xe graphics platforms.
> +
> +What: /sys/bus/pci/drivers/xe/.../hwmon/hwmon<i>/temp5_emergency
> +Date: January 2026
> +KernelVersion: 7.0
> +Contact: intel-xe@lists.freedesktop.org
> +Description: RO. GPU PCIe shutdown temperature in millidegree Celsius.
> +
> + Only supported for particular Intel Xe graphics platforms.
> +
> +What: /sys/bus/pci/drivers/xe/.../hwmon/hwmon<i>/temp5_crit
> +Date: January 2026
> +KernelVersion: 7.0
> +Contact: intel-xe@lists.freedesktop.org
> +Description: RO. GPU PCIe critical temperature in millidegree Celsius.
> +
> + Only supported for particular Intel Xe graphics platforms.
Same comments as last patch.
> diff --git a/drivers/gpu/drm/xe/xe_hwmon.c b/drivers/gpu/drm/xe/xe_hwmon.c
> index 2bf5c9ac948a..317e30c4e1f1 100644
> --- a/drivers/gpu/drm/xe/xe_hwmon.c
> +++ b/drivers/gpu/drm/xe/xe_hwmon.c
> @@ -44,6 +44,7 @@ enum xe_hwmon_channel {
> CHANNEL_PKG,
> CHANNEL_VRAM,
> CHANNEL_MCTRL,
> + CHANNEL_PCIE,
> CHANNEL_MAX,
> };
>
> @@ -714,6 +715,7 @@ static const struct hwmon_channel_info * const hwmon_info[] = {
> HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_EMERGENCY | HWMON_T_CRIT |
> HWMON_T_MAX,
> HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_EMERGENCY | HWMON_T_CRIT,
> + HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_EMERGENCY | HWMON_T_CRIT,
Alphabetic order please!
> HWMON_T_LABEL | HWMON_T_INPUT | HWMON_T_EMERGENCY | HWMON_T_CRIT),
> HWMON_CHANNEL_INFO(power, HWMON_P_MAX | HWMON_P_RATED_MAX | HWMON_P_LABEL | HWMON_P_CRIT |
> HWMON_P_CAP,
> @@ -781,6 +783,27 @@ static int get_mc_temp(struct xe_hwmon *hwmon, long *val)
> return 0;
> }
>
> +static int get_pcie_temp(struct xe_hwmon *hwmon, long *val)
> +{
> + struct xe_tile *root_tile = xe_device_get_root_tile(hwmon->xe);
> + int ret = 0;
Redundant initialization.
> + u32 data = 0;
> +
> + ret = xe_pcode_read(root_tile, PCODE_MBOX(PCODE_THERMAL_INFO, READ_THERMAL_DATA,
> + PCIE_SENSOR_GROUP_ID), &data, NULL);
> + if (ret)
> + return ret;
> +
> + /* Sensor offset is different for G21 */
> + if (hwmon->xe->info.subplatform != XE_SUBPLATFORM_BATTLEMAGE_G21)
> + data >>= PCIE_SENSOR_SHIFT;
Rather, define a mask and use REG_FIELD_GET() instead of open-coding the shift:

	#define PCIE_SENSOR_MASK	REG_GENMASK(30, 16)

	data = REG_FIELD_GET(PCIE_SENSOR_MASK, data);
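
For illustration only, the suggestion above can be sketched as a small userspace program fragment. GENMASK_U32(), FIELD_GET_U32() and pcie_temp_millidegree() are hypothetical stand-ins written for this example, not the kernel's REG_GENMASK()/REG_FIELD_GET(); the 8-bit TEMP_MASK width is assumed from the TEMP_MASK_MAILBOX change in this patch, and the G21 special case is carried over from the patch as-is. Untested against hardware.

```c
#include <stdint.h>

/* Hypothetical userspace stand-ins for REG_GENMASK()/REG_FIELD_GET(). */
#define GENMASK_U32(h, l) \
	((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))
/* Divide by the lowest set bit of a contiguous mask to shift the field down. */
#define FIELD_GET_U32(mask, val) \
	(((val) & (mask)) / ((mask) & ~((mask) << 1)))

#define PCIE_SENSOR_MASK	GENMASK_U32(30, 16)	/* from the review comment */
#define TEMP_MASK		GENMASK_U32(7, 0)	/* assumed 8-bit signed field */
#define MILLIDEGREE_PER_DEGREE	1000

/*
 * Convert a raw mailbox word to millidegree Celsius.
 * is_g21: nonzero when the sensor value already sits in the low bits.
 */
static long pcie_temp_millidegree(uint32_t data, int is_g21)
{
	if (!is_g21)
		data = FIELD_GET_U32(PCIE_SENSOR_MASK, data);

	data &= TEMP_MASK;

	/* Reinterpret the low byte as a signed temperature in degrees. */
	return (long)(int8_t)data * MILLIDEGREE_PER_DEGREE;
}
```

With these stand-ins, a raw word of 0x002D0000 on a non-G21 part and 0x0000002D on G21 both decode to the same temperature, which is the point of keeping the offset in a named mask rather than a bare shift.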
> + data &= TEMP_MASK_MAILBOX;
Don't we already have TEMP_MASK?
> + *val = (s8)data * MILLIDEGREE_PER_DEGREE;
> +
> + return 0;
> +}
> +
> /* I1 is exposed as power_crit or as curr_crit depending on bit 31 */
> static int xe_hwmon_pcode_read_i1(const struct xe_hwmon *hwmon, u32 *uval)
> {
> @@ -886,6 +909,7 @@ xe_hwmon_temp_is_visible(struct xe_hwmon *hwmon, u32 attr, int channel)
> case CHANNEL_VRAM:
> return hwmon->temp.limit[TEMP_LIMIT_MEM_SHUTDOWN] ? 0444 : 0;
> case CHANNEL_MCTRL:
> + case CHANNEL_PCIE:
> return hwmon->temp.count ? 0444 : 0;
> default:
> return 0;
> @@ -898,6 +922,7 @@ xe_hwmon_temp_is_visible(struct xe_hwmon *hwmon, u32 attr, int channel)
> case CHANNEL_VRAM:
> return hwmon->temp.limit[TEMP_LIMIT_MEM_TJMAX] ? 0444 : 0;
> case CHANNEL_MCTRL:
> + case CHANNEL_PCIE:
> return hwmon->temp.count ? 0444 : 0;
> default:
> return 0;
> @@ -919,6 +944,7 @@ xe_hwmon_temp_is_visible(struct xe_hwmon *hwmon, u32 attr, int channel)
> return xe_reg_is_valid(xe_hwmon_get_reg(hwmon, REG_TEMP,
> channel)) ? 0444 : 0;
> case CHANNEL_MCTRL:
> + case CHANNEL_PCIE:
> return hwmon->temp.count ? 0444 : 0;
> default:
> return 0;
> @@ -946,12 +972,15 @@ xe_hwmon_temp_read(struct xe_hwmon *hwmon, u32 attr, int channel, long *val)
> break;
> case CHANNEL_MCTRL:
> return get_mc_temp(hwmon, val);
> + case CHANNEL_PCIE:
> + return get_pcie_temp(hwmon, val);
> }
> break;
> case hwmon_temp_emergency:
> switch (channel) {
> case CHANNEL_PKG:
> case CHANNEL_MCTRL:
> + case CHANNEL_PCIE:
> *val = hwmon->temp.limit[TEMP_LIMIT_PKG_SHUTDOWN] * MILLIDEGREE_PER_DEGREE;
> break;
> case CHANNEL_VRAM:
> @@ -963,6 +992,7 @@ xe_hwmon_temp_read(struct xe_hwmon *hwmon, u32 attr, int channel, long *val)
> switch (channel) {
> case CHANNEL_PKG:
> case CHANNEL_MCTRL:
> + case CHANNEL_PCIE:
> *val = hwmon->temp.limit[TEMP_LIMIT_PKG_TJMAX] * MILLIDEGREE_PER_DEGREE;
> break;
> case CHANNEL_VRAM:
> @@ -1341,6 +1371,8 @@ static int xe_hwmon_read_label(struct device *dev,
> *str = "vram";
> else if (channel == CHANNEL_MCTRL)
> *str = "mctrl";
> + else if (channel == CHANNEL_PCIE)
> + *str = "pcie";
> return 0;
> case hwmon_power:
> case hwmon_energy:
> diff --git a/drivers/gpu/drm/xe/xe_pcode_api.h b/drivers/gpu/drm/xe/xe_pcode_api.h
> index fc8811a87741..dd7635bbc4e7 100644
> --- a/drivers/gpu/drm/xe/xe_pcode_api.h
> +++ b/drivers/gpu/drm/xe/xe_pcode_api.h
> @@ -70,7 +70,9 @@
> #define READ_THERMAL_CONFIG 0x1
> #define READ_THERMAL_DATA 0x2
> #define TEMP_INDEX_MCTRL 0x2
> -#define TEMP_MASK_MAILBOX REG_GENMASK8(6, 0)
> +#define TEMP_MASK_MAILBOX REG_GENMASK8(7, 0)
> +#define PCIE_SENSOR_GROUP_ID 0x2
The convention for submacros is double space here, so let's make
it consistent.
Raag
> +#define PCIE_SENSOR_SHIFT 16
>
> #define PCODE_FREQUENCY_CONFIG 0x6e
> /* Frequency Config Sub Commands (param1) */
> --
> 2.25.1
>