From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>
Cc: intel-xe@lists.freedesktop.org
Subject: Re: [PATCH] drm/xe/reg_sr: Add debugfs to verify status of reg_sr programming
Date: Mon, 09 Feb 2026 15:56:11 -0800 [thread overview]
Message-ID: <87wm0llbdw.wl-ashutosh.dixit@intel.com> (raw)
In-Reply-To: <20260207002304.585103-2-matthew.d.roper@intel.com>
On Fri, 06 Feb 2026 16:23:05 -0800, Matt Roper wrote:
>
> When applying save-restore register programming for workarounds, tuning
> settings, and general device configuration we assume the programming was
> successful. However there are a number of cases where the desired
> reg_sr programming can become lost:
>
> - workarounds implemented on the wrong RTP table might not get
> saved/restored at the right time leading to, for example, failure to
> re-apply the programming after engine resets
I guess this can be checked in the code, post reset, and we could print
that reg s/r is wrong, rather than rely on debugfs. But anyway this is
future patch, if at all.
> - some hardware registers become "locked" and can no longer be updated
> after firmware or the driver finishes initializing them
> - sometimes the hardware teams just made a mistake when documenting the
> register and/or bits that needed to be programmed
>
> Add a debugfs entry that will read back the registers referenced on a
> GT's save-restore lists and print any cases where the desired
> programming is no longer in effect. Such cases might indicate the
> presence of a driver/firmware bug, might indicate that the documentation
> we were following has a mistake, or might be benign (occasionally
> registers have broken read-back capability preventing verification, but
> previous writes were still successful and effective).
>
> For now we only verify the GT and engine reg_sr lists. Verifying the
> LRC list will require checking the expected programming against the
> default_lrc contents, not the live registers (which may not reflect the
> reg_sr programming if no context is actively running).
>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_debugfs.c | 26 +++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_reg_sr.c | 34 ++++++++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_reg_sr.h | 3 +++
> 3 files changed, 63 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_debugfs.c b/drivers/gpu/drm/xe/xe_gt_debugfs.c
> index 4363bc9c3606..aa43427a9f4b 100644
> --- a/drivers/gpu/drm/xe/xe_gt_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_gt_debugfs.c
> @@ -155,6 +155,30 @@ static int register_save_restore(struct xe_gt *gt, struct drm_printer *p)
> return 0;
> }
>
> +/*
> + * Check the registers referenced on a save-restore list and report any
> + * save-restore entries that did not get applied.
> + */
> +static int register_save_restore_check(struct xe_gt *gt, struct drm_printer *p)
> +{
> + struct xe_hw_engine *hwe;
> + enum xe_hw_engine_id id;
> +
> + CLASS(xe_force_wake, fw_ref)(gt_to_fw(gt), XE_FORCEWAKE_ALL);
> + if (!xe_force_wake_ref_has_domain(fw_ref.domains, XE_FORCEWAKE_ALL)) {
> + drm_printf(p, "ERROR: Could not acquire forcewake\n");
> + return -ETIMEDOUT;
> + }
> +
> + xe_reg_sr_readback_check(>->reg_sr, gt, p);
> + for_each_hw_engine(hwe, gt, id)
> + xe_reg_sr_readback_check(&hwe->reg_sr, gt, p);
> +
> + /* TODO: Check hwe->reg_lrc against contents of default_lrc. */
> +
> + return 0;
> +}
> +
> static int rcs_default_lrc(struct xe_gt *gt, struct drm_printer *p)
> {
> xe_lrc_dump_default(p, gt, XE_ENGINE_CLASS_RENDER);
> @@ -209,6 +233,8 @@ static const struct drm_info_list vf_safe_debugfs_list[] = {
> { "default_lrc_vecs", .show = xe_gt_debugfs_show_with_rpm, .data = vecs_default_lrc },
> { "hwconfig", .show = xe_gt_debugfs_show_with_rpm, .data = hwconfig },
> { "pat_sw_config", .show = xe_gt_debugfs_simple_show, .data = xe_pat_dump_sw_config },
> + { "register-save-restore-check",
> + .show = xe_gt_debugfs_show_with_rpm, .data = register_save_restore_check },
Not sure about this, so asking: can "register-save-restore-check" be added
to vf_safe_debugfs_list[] or must it be added to pf_only_debugfs_list[]?
> };
>
> /* everything else should be added here */
> diff --git a/drivers/gpu/drm/xe/xe_reg_sr.c b/drivers/gpu/drm/xe/xe_reg_sr.c
> index d3e13ea33123..37555d5211d8 100644
> --- a/drivers/gpu/drm/xe/xe_reg_sr.c
> +++ b/drivers/gpu/drm/xe/xe_reg_sr.c
> @@ -204,3 +204,37 @@ void xe_reg_sr_dump(struct xe_reg_sr *sr, struct drm_printer *p)
> str_yes_no(entry->reg.masked),
> str_yes_no(entry->reg.mcr));
> }
> +
> +static u32 readback_reg(struct xe_gt *gt, struct xe_reg reg)
> +{
> + struct xe_reg_mcr mcr_reg = to_xe_reg_mcr(reg);
> +
> + if (reg.mcr)
> + return xe_gt_mcr_unicast_read_any(gt, mcr_reg);
> + else
> + return xe_mmio_read32(>->mmio, reg);
> +}
> +
> +/**
> + * xe_reg_sr_readback_check() - Readback registers referenced in save/restore
> + * entries and check whether the programming is in place.
> + * @sr: Save/restore entries
> + * @gt: GT to read register from
> + * @p: DRM printer to report discrepancies on
> + */
> +void xe_reg_sr_readback_check(struct xe_reg_sr *sr,
> + struct xe_gt *gt,
> + struct drm_printer *p)
> +{
> + struct xe_reg_sr_entry *entry;
> + unsigned long offset;
> +
Not sure if it is desirable/possible to add a "xe_force_wake_assert_held"
here...
Assuming the comments above are ok, LGTM:
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
> + xa_for_each(&sr->xa, offset, entry) {
> + u32 val = readback_reg(gt, entry->reg);
> + u32 mask = entry->clr_bits | entry->set_bits;
> +
> + if ((val & mask) != entry->set_bits)
> + drm_printf(p, "%#8lx & %#10x :: expected %#10x got %#10x\n",
> + offset, mask, entry->set_bits, val & mask);
> + }
> +}
> diff --git a/drivers/gpu/drm/xe/xe_reg_sr.h b/drivers/gpu/drm/xe/xe_reg_sr.h
> index 51fbba423e27..cd133a09aa9b 100644
> --- a/drivers/gpu/drm/xe/xe_reg_sr.h
> +++ b/drivers/gpu/drm/xe/xe_reg_sr.h
> @@ -19,6 +19,9 @@ struct drm_printer;
>
> int xe_reg_sr_init(struct xe_reg_sr *sr, const char *name, struct xe_device *xe);
> void xe_reg_sr_dump(struct xe_reg_sr *sr, struct drm_printer *p);
> +void xe_reg_sr_readback_check(struct xe_reg_sr *sr,
> + struct xe_gt *gt,
> + struct drm_printer *p);
>
> int xe_reg_sr_add(struct xe_reg_sr *sr, const struct xe_reg_sr_entry *e,
> struct xe_gt *gt);
> --
> 2.52.0
>
next prev parent reply other threads:[~2026-02-09 23:56 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-07 0:23 [PATCH] drm/xe/reg_sr: Add debugfs to verify status of reg_sr programming Matt Roper
2026-02-07 0:29 ` ✓ CI.KUnit: success for " Patchwork
2026-02-07 1:09 ` ✓ Xe.CI.BAT: " Patchwork
2026-02-07 21:51 ` ✗ Xe.CI.FULL: failure " Patchwork
2026-02-09 23:56 ` Dixit, Ashutosh [this message]
2026-02-11 0:48 ` [PATCH] " Matt Roper
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87wm0llbdw.wl-ashutosh.dixit@intel.com \
--to=ashutosh.dixit@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=matthew.d.roper@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.