Intel-XE Archive on lore.kernel.org
From: Lucas De Marchi <lucas.demarchi@intel.com>
To: intel-xe@lists.freedesktop.org
Cc: Lucas De Marchi <lucas.demarchi@intel.com>,
	Raag Jadav <raag.jadav@intel.com>,
	Rodrigo Vivi <rodrigo.vivi@intel.com>
Subject: [PATCH v2 8/8] drm/xe/gt_throttle: Avoid TOCTOU when monitoring reasons
Date: Sun, 26 Oct 2025 22:57:20 -0700
Message-ID: <20251026-gt-throttle-cri-v2-8-41f8288a71a7@intel.com>
In-Reply-To: <20251026-gt-throttle-cri-v2-0-41f8288a71a7@intel.com>

It's currently not possible to safely monitor whether throttling is
happening and for what reasons. Reading the status and then reading the
reasons is not reliable: by the time the sysadmin reads a reason, the
throttling may no longer be happening.

A previous attempt to fix that[1] broke the ABI and potentially
sysadmins' scripts. This patch takes a different approach: add and
document an additional attribute. It's still valuable, though
redundant, to provide the simpler 0/1 interface.

To avoid requiring userspace to know the meaning of the bitmask, and to
keep the kernel side in sync with possible future changes, just walk
the attribute group and report the attributes whose masks match the
value read.
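
As a rough illustration of that walk, here is a standalone userspace
model of the matching logic. It is only a sketch: struct reason_entry,
reasons_table and emit_reasons() are hypothetical names, and the bit
positions are made up for the example rather than taken from the
hardware register layout.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct throttle_attribute: a name and its mask. */
struct reason_entry {
	const char *name;
	uint32_t mask;
};

/* Bit positions are illustrative only, not the real register layout. */
static const struct reason_entry reasons_table[] = {
	{ "status_reasons", 0 },		/* mask 0: can never match */
	{ "status", UINT32_MAX },		/* aggregate: skipped explicitly */
	{ "reason_pl1", 1u << 0 },
	{ "reason_pl2", 1u << 1 },
	{ "reason_pl4", 1u << 2 },
	{ NULL, 0 },				/* terminator, like attrs[] */
};

/*
 * Append the name of every entry whose mask intersects @reasons,
 * one per line. Returns the number of bytes written, excluding the
 * terminating NUL.
 */
static size_t emit_reasons(uint32_t reasons, char *buf, size_t len)
{
	const struct reason_entry *e;
	size_t off = 0;

	if (!len)
		return 0;
	buf[0] = '\0';

	for (e = reasons_table; e->name; e++) {
		int n;

		if (e->mask == UINT32_MAX || !(reasons & e->mask))
			continue;

		n = snprintf(buf + off, len - off, "%s\n", e->name);
		if (n < 0 || (size_t)n >= len - off) {
			buf[off] = '\0';	/* drop a partially written name */
			break;
		}
		off += (size_t)n;
	}

	return off;
}
```

Because the names come from the same table as the masks, adding a new
reason only needs a new table entry; the reporting loop is unchanged.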

[1] https://lore.kernel.org/intel-xe/20241025092238.167042-1-raag.jadav@intel.com/

Cc: Raag Jadav <raag.jadav@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
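For reviewers, a minimal sketch of how a monitoring tool might consume
the new attribute. The helper name read_status_reasons() is
hypothetical, and the caller is assumed to pass the full path to
status_reasons for the GT of interest. The point is that a single read
yields both "is it throttling?" (non-empty output) and "why?", both
derived from one register read on the kernel side.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Read the whole attribute with a single read(), so status and
 * reasons come from the same snapshot.
 * Returns the number of bytes read (0 means no throttling reasons
 * were reported) or -1 on error. @buf is always NUL-terminated on
 * success.
 */
static ssize_t read_status_reasons(const char *path, char *buf, size_t len)
{
	ssize_t n;
	int fd;

	if (!len)
		return -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	n = read(fd, buf, len - 1);
	close(fd);
	if (n < 0)
		return -1;

	buf[n] = '\0';
	return n;
}
```

A caller would treat a return value of 0 as "not throttled" and
otherwise split the buffer on newlines to get the attribute names.
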
 drivers/gpu/drm/xe/xe_gt_throttle.c | 46 +++++++++++++++++++++++++++++++++++--
 1 file changed, 44 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_gt_throttle.c b/drivers/gpu/drm/xe/xe_gt_throttle.c
index fa7068aac3344..fd2988dacbbb6 100644
--- a/drivers/gpu/drm/xe/xe_gt_throttle.c
+++ b/drivers/gpu/drm/xe/xe_gt_throttle.c
@@ -22,9 +22,15 @@
  * Their availability depend on the platform and some may not be visible if that
  * reason is not available.
  *
+ * The ``status_reasons`` attribute lets a sysadmin monitor all possible
+ * throttling reasons and report them. It's preferred over monitoring
+ * ``status`` and then reading each reason, both for simplicity and to
+ * avoid TOCTOU.
+ *
  * The following attributes are available on Crescent Island platform:
  *
- * - ``status``: Overall throttle status
+ * - ``status``: Overall throttle status (0: no throttling, 1: throttling)
+ * - ``status_reasons``: All reasons causing throttling separated by newline.
  * - ``reason_pl1``: package PL1
  * - ``reason_pl2``: package PL2
  * - ``reason_pl4``: package PL4
@@ -43,7 +49,8 @@
  *
  * Other platforms support the following reasons:
  *
- * - ``status``: Overall status
+ * - ``status``: Overall throttle status (0: no throttling, 1: throttling)
+ * - ``status_reasons``: All reasons causing throttling separated by newline.
  * - ``reason_pl1``: package PL1
  * - ``reason_pl2``: package PL2
  * - ``reason_pl4``: package PL4, Iccmax etc.
@@ -111,12 +118,45 @@ static ssize_t reason_show(struct kobject *kobj,
 	return sysfs_emit(buff, "%u\n", is_throttled_by(gt, ta->mask));
 }
 
+static const struct attribute_group *get_platform_throttle_group(struct xe_device *xe);
+
+static ssize_t status_reasons_show(struct kobject *kobj,
+				   struct kobj_attribute *attr, char *buff)
+{
+	struct xe_gt *gt = throttle_to_gt(kobj);
+	struct xe_device *xe = gt_to_xe(gt);
+	const struct attribute_group *group;
+	struct attribute **pother;
+	ssize_t ret = 0;
+	u32 reasons;
+
+	reasons = xe_gt_throttle_get_limit_reasons(gt);
+	group = get_platform_throttle_group(xe);
+
+	for (pother = group->attrs; *pother; pother++) {
+		struct kobj_attribute *kattr = container_of(*pother, struct kobj_attribute, attr);
+		struct throttle_attribute *other_ta = kobj_attribute_to_throttle(kattr);
+
+		if (other_ta->mask != U32_MAX && reasons & other_ta->mask)
+			ret += sysfs_emit_at(buff, ret, "%s\n", (*pother)->name);
+	}
+
+	return ret;
+}
+
 #define THROTTLE_ATTR_RO(name, _mask)				\
 	struct throttle_attribute attr_##name =	{		\
 		.attr = __ATTR(name, 0444, reason_show, NULL),	\
 		.mask = _mask,					\
 	}
 
+#define THROTTLE_ATTR_RO_FUNC(name, _mask, _show)		\
+	struct throttle_attribute attr_##name =	{		\
+		.attr = __ATTR(name, 0444, _show, NULL),	\
+		.mask = _mask,					\
+	}
+
+static THROTTLE_ATTR_RO_FUNC(status_reasons, 0, status_reasons_show);
 static THROTTLE_ATTR_RO(status, U32_MAX);
 static THROTTLE_ATTR_RO(reason_pl1, POWER_LIMIT_1_MASK);
 static THROTTLE_ATTR_RO(reason_pl2, POWER_LIMIT_2_MASK);
@@ -128,6 +168,7 @@ static THROTTLE_ATTR_RO(reason_vr_thermalert, VR_THERMALERT_MASK);
 static THROTTLE_ATTR_RO(reason_vr_tdc, VR_TDC_MASK);
 
 static struct attribute *throttle_attrs[] = {
+	&attr_status_reasons.attr.attr,
 	&attr_status.attr.attr,
 	&attr_reason_pl1.attr.attr,
 	&attr_reason_pl2.attr.attr,
@@ -153,6 +194,7 @@ static THROTTLE_ATTR_RO(reason_psys_crit, PSYS_CRIT_MASK);
 
 static struct attribute *cri_throttle_attrs[] = {
 	/* Common */
+	&attr_status_reasons.attr.attr,
 	&attr_status.attr.attr,
 	&attr_reason_pl1.attr.attr,
 	&attr_reason_pl2.attr.attr,

-- 
2.51.0
