public inbox for intel-xe@lists.freedesktop.org
* [PATCH v2 1/1] drm/xe/eustall: Return EBADFD from read if EU stall registers get reset
From: Harish Chegondi @ 2026-03-16 17:58 UTC
  To: intel-xe
  Cc: felix.j.degrood, matias.a.cabral, joshua.santosh.ranjan,
	Harish Chegondi, Ashutosh Dixit

If a GT or engine reset happens during EU stall data sampling, all the
EU stall registers can get reset to 0. The read and write pointer
registers of the EU stall data buffers then go out of sync with their
cached values, and read() returns invalid data. To prevent this, check
the value of an EU stall base register. If it is zero, a reset may have
happened that wiped the register. In that case, return EBADFD from
read(), upon which user space should close the fd and open a new fd
for a new EU stall data collection session.

Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
---
v2: Move base register check from read to the poll function
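
For illustration only, the user-space side of the recovery described in
the commit message could look roughly like the sketch below. It assumes
the caller already holds an EU stall stream fd; how that fd is opened is
not shown. The enum and both function names are hypothetical and are not
part of the xe uAPI. Only the EBADFD handling comes from this patch;
treating EINTR and EAGAIN as retryable is a general read() convention.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names, for illustration only; not part of the xe uAPI. */
enum stall_read_action {
	STALL_DATA,		/* read() returned ret >= 0 bytes */
	STALL_RETRY,		/* transient: poll() and read() again */
	STALL_REOPEN,		/* EBADFD: close the fd, open a new stream */
	STALL_OTHER_ERROR,	/* anything else: the caller decides */
};

/* Map a read() return value and the saved errno to the next step. */
static enum stall_read_action classify_stall_read(ssize_t ret, int err)
{
	if (ret >= 0)
		return STALL_DATA;

	switch (err) {
	case EINTR:
	case EAGAIN:
		return STALL_RETRY;
	case EBADFD:
		/* Registers were reset; the stream state is no longer valid. */
		return STALL_REOPEN;
	default:
		return STALL_OTHER_ERROR;
	}
}

/* Read once from the stream fd and report what the caller should do. */
static enum stall_read_action read_stall_data(int fd, void *buf, size_t len,
					      ssize_t *nread)
{
	ssize_t ret = read(fd, buf, len);
	int err = errno;	/* save errno before any other call */

	*nread = ret;
	return classify_stall_read(ret, err);
}
```

On STALL_REOPEN the caller would close() the fd and start a new EU stall
data collection session instead of retrying the read on the same fd.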

 drivers/gpu/drm/xe/xe_eu_stall.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_eu_stall.c b/drivers/gpu/drm/xe/xe_eu_stall.c
index c34408cfd292..7e14de73a2c9 100644
--- a/drivers/gpu/drm/xe/xe_eu_stall.c
+++ b/drivers/gpu/drm/xe/xe_eu_stall.c
@@ -44,6 +44,7 @@ struct per_xecore_buf {
 struct xe_eu_stall_data_stream {
 	bool pollin;
 	bool enabled;
+	bool reset_detected;
 	int wait_num_reports;
 	int sampling_rate_mult;
 	wait_queue_head_t poll_wq;
@@ -428,6 +429,17 @@ static bool eu_stall_data_buf_poll(struct xe_eu_stall_data_stream *stream)
 			set_bit(xecore, stream->data_drop.mask);
 		xecore_buf->write = write_ptr;
 	}
+	/* If a GT or engine reset happens during EU stall sampling,
+	 * all EU stall registers get reset to 0 and the cached values of
+	 * the EU stall data buffers' read pointers are out of sync with
+	 * the register values. This causes invalid data to be returned
+	 * from read(). To prevent this, check the value of an EU stall base
+	 * register. If it is zero, there has been a reset.
+	 */
+	if (unlikely(!xe_gt_mcr_unicast_read_any(gt, XEHPC_EUSTALL_BASE))) {
+		stream->reset_detected = true;
+		min_data_present = true;
+	}
 	mutex_unlock(&stream->xecore_buf_lock);
 
 	return min_data_present;
@@ -554,6 +566,15 @@ static ssize_t xe_eu_stall_stream_read_locked(struct xe_eu_stall_data_stream *st
 		}
 		stream->data_drop.reported_to_user = false;
 	}
+	/* If EU stall registers got reset due to a GT/engine reset,
+	 * continuing with the read() will return invalid data to
+	 * the user space. Just return -EBADFD instead.
+	 */
+	if (unlikely(stream->reset_detected)) {
+		xe_gt_dbg(gt, "EU stall base register has been reset\n");
+		mutex_unlock(&stream->xecore_buf_lock);
+		return -EBADFD;
+	}
 
 	for_each_dss_steering(xecore, gt, group, instance) {
 		ret = xe_eu_stall_data_buf_read(stream, buf, count, &total_size,
@@ -692,6 +713,7 @@ static int xe_eu_stall_stream_enable(struct xe_eu_stall_data_stream *stream)
 		xecore_buf->write = write_ptr;
 		xecore_buf->read = write_ptr;
 	}
+	stream->reset_detected = false;
 	stream->data_drop.reported_to_user = false;
 	bitmap_zero(stream->data_drop.mask, XE_MAX_DSS_FUSE_BITS);
 
@@ -717,7 +739,7 @@ static void eu_stall_data_buf_poll_work_fn(struct work_struct *work)
 		container_of(work, typeof(*stream), buf_poll_work.work);
 	struct xe_gt *gt = stream->gt;
 
-	if (eu_stall_data_buf_poll(stream)) {
+	if (stream->reset_detected || eu_stall_data_buf_poll(stream)) {
 		stream->pollin = true;
 		wake_up(&stream->poll_wq);
 	}
-- 
2.43.0




Thread overview: 13+ messages
2026-03-16 17:58 [PATCH v2 1/1] drm/xe/eustall: Return EBADFD from read if EU stall registers get reset Harish Chegondi
2026-03-16 22:04 ` ✓ CI.KUnit: success for series starting with [v2,1/1] " Patchwork
2026-03-16 22:51 ` ✓ Xe.CI.BAT: " Patchwork
2026-03-17 23:24 ` ✗ Xe.CI.FULL: failure " Patchwork
2026-03-18  6:57 ` [PATCH v2 1/1] " Dixit, Ashutosh
2026-03-18 16:59   ` Cabral, Matias A
2026-03-18 18:38     ` Cabral, Matias A
2026-03-20 18:35       ` Harish Chegondi
2026-03-18 21:36     ` Harish Chegondi
2026-03-18 21:30   ` Harish Chegondi
2026-03-19  3:55     ` Dixit, Ashutosh
2026-03-20 20:59       ` Harish Chegondi
2026-03-23 20:17         ` Dixit, Ashutosh
