From: Matt Roper <matthew.d.roper@intel.com>
To: igt-dev@lists.freedesktop.org
Cc: Matt Roper <matthew.d.roper@intel.com>
Subject: [PATCH i-g-t] tests/intel/xe_reg_sr_check: Add test for reg_sr programming failures
Date: Wed, 18 Mar 2026 14:57:51 -0700
Message-ID: <20260318-reg_sr_check-v1-1-845d09d27bd1@intel.com>

The kernel provides a 'register-save-restore-check' debugfs entry that
lets developers check whether any of the driver's 'register
save/restore' (reg_sr) programming is no longer in effect.  Wrap a
simple IGT test around this debugfs entry so that CI can flag any
unexpected changes via a dedicated test.

Note that we're intentionally avoiding i915's approach of having the
driver do immediate readback and verification of workaround/tuning
programming.  That wound up being very problematic since any programming
failure (even benign/expected failures) would show up as a problem on
driver probe, and CI would treat that as a fatal error and refuse to run
any other tests.

At the moment this test will already report gt0 failures on some Xe2
platforms (specifically for workaround registers 0xb104, 0xb108, and
0xb158).  This reflects a legitimate kernel bug that has been
root-caused to incorrect bspec documentation about MCR register
steering.  Fortunately the bug only affects the register readback used
for verification; the actual programming did reach the hardware as
expected in this case.  The fix for that failure will be implemented in
the kernel once the necessary hardware documentation is available, at
which point this test should start passing on those platforms.

The test currently carries an "exception" list containing one register
(GUC_INTR_CHICKEN_GUC_REG) which is expected to show up in the debugfs
entry.  Once the KMD completes its initial programming, ownership of
this register transfers to an external agent (the GuC firmware), so
further changes to its value are legitimate and not indicative of any
hardware or software problem.  Other exceptions may show up in the
future, either due to cases where ownership of a register transfers, or
cases where reg_sr programming targets "write only" registers that are
not expected to read back properly.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
---
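Note: the exception filtering described above can be tried outside of
IGT with a standalone sketch like the one below.  It is only an
illustration, not part of the patch.  It assumes (as the test's
strtoul() call does) that each line of the debugfs entry begins with
the register offset in hex; the exact line format is not spelled out
here.  The names is_expected_mismatch() and count_problems() are
hypothetical helpers, not IGT or kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Offsets whose mismatches are expected; 0xC50C is GUC_INTR_CHICKEN_GUC_REG. */
static const unsigned long expected_mismatches[] = { 0xC50C };

/* Return true if the offset is on the exception list. */
static bool is_expected_mismatch(unsigned long offset)
{
	size_t n = sizeof(expected_mismatches) / sizeof(expected_mismatches[0]);

	for (size_t i = 0; i < n; i++)
		if (expected_mismatches[i] == offset)
			return true;

	return false;
}

/*
 * Count lines that report an unexpected mismatch.
 *
 * ASSUMPTION: every line starts with the register offset in hex.  A line
 * with no leading hex number is counted as a problem, which matches what
 * the test does when strtoul() yields 0.
 */
static int count_problems(const char *const *lines, size_t nlines)
{
	int problems = 0;

	assert(lines != NULL || nlines == 0);

	for (size_t i = 0; i < nlines; i++) {
		char *end;
		unsigned long offset = strtoul(lines[i], &end, 16);

		if (end == lines[i] || !is_expected_mismatch(offset))
			problems++;
	}

	return problems;
}
```

Feeding it one line for 0xc50c and one for 0xb104 would count a single
problem, since only the GuC-owned register is on the exception list.
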
 tests/intel/xe_reg_sr_check.c | 91 +++++++++++++++++++++++++++++++++++++++++++
 tests/meson.build             |  1 +
 2 files changed, 92 insertions(+)

diff --git a/tests/intel/xe_reg_sr_check.c b/tests/intel/xe_reg_sr_check.c
new file mode 100644
index 0000000000000000000000000000000000000000..a5f0bbd7796b05559c373f13fc7380d5e19ac1b9
--- /dev/null
+++ b/tests/intel/xe_reg_sr_check.c
@@ -0,0 +1,91 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2026 Intel Corporation
+ */
+
+#include <fcntl.h>
+#include <stdio.h>
+
+#include "igt.h"
+#include "igt_debugfs.h"
+#include "xe/xe_query.h"
+
+/**
+ * TEST: Register save-restore persistence
+ * Description:
+ *   Test whether the register programming specified by the kernel's
+ *   register save/restore (reg_sr) lists is still in effect.  If the current
+ *   value of a hardware register (or the value of a register recorded in the
+ *   default LRC in case of the LRC reg_sr) does not reflect the expected
+ *   values, this may indicate a bug in the Xe driver (e.g., attempted register
+ *   programming is not possible, or register was added to the wrong reg_sr
+ *   list) or may reveal a defect in firmware/hardware (e.g., register is
+ *   incorrectly losing its value at unexpected times, or MMIO readout of the
+ *   register is broken).
+ *
+ * Category: Core
+ * Mega feature: General Core features
+ * Sub-category: Debugging
+ *
+ * SUBTEST: check-gt
+ * Description: Check the reg_sr list associated with GTs for missing reg values
+ */
+
+/*
+ * A small number of reg_sr programming mismatches are expected and not
+ * indicative of hardware/software problems.
+ */
+static const unsigned long exceptions[] = {
+	/* GUC_INTR_CHICKEN_GUC_REG: GuC takes ownership after initial programming */
+	0xC50C,
+};
+
+static void check_gt(int fd, int gt)
+{
+	char buf[1024];
+	int debugfs_fd;
+	FILE *file;
+	int problems = 0;
+
+	debugfs_fd = igt_debugfs_gt_open(fd, gt, "register-save-restore-check",
+					 O_RDONLY);
+	igt_require(debugfs_fd >= 0);
+	file = fdopen(debugfs_fd, "r");
+	while (fgets(buf, sizeof(buf), file) != NULL) {
+		unsigned long offset = strtoul(buf, NULL, 16);
+		bool ok = false;
+
+		for (int ex = 0; ex < ARRAY_SIZE(exceptions); ex++) {
+			if (offset == exceptions[ex]) {
+				igt_info("Mismatch on %#lx is not a problem\n", offset);
+				ok = true;
+				break;
+			}
+		}
+
+		if (!ok) {
+			igt_warn("Mismatch on %#lx, Driver reports: %s", offset, buf);
+			problems++;
+		}
+	}
+
+	/* fclose() also closes the underlying debugfs_fd */
+	fclose(file);
+
+	igt_assert_eq(problems, 0);
+}
+
+int igt_main()
+{
+	int fd, gt;
+
+	igt_fixture() {
+		fd = drm_open_driver_master(DRIVER_XE);
+		igt_require(fd >= 0);
+	}
+
+	igt_subtest_with_dynamic("check-gt")
+		xe_for_each_gt(fd, gt)
+			igt_dynamic_f("gt%d", gt)
+				check_gt(fd, gt);
+}
diff --git a/tests/meson.build b/tests/meson.build
index 7e0359a9dc554c4b91c01b9fada4605f76ede515..658f10630e3199c8b15223d9bb928cdcee216e7c 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -329,6 +329,7 @@ intel_xe_progs = [
 	'xe_prime_self_import',
 	'xe_pxp',
 	'xe_query',
+	'xe_reg_sr_check',
 	'xe_render_copy',
 	'xe_vm',
 	'xe_waitfence',

---
base-commit: 4c8773922f643932cc017ba94d164d2b9d3dd546
change-id: 20260312-reg_sr_check-95efc4248b54

Best regards,
-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


