Intel-XE Archive on lore.kernel.org
* [PATCH v6 0/6] Maintenence of devcoredump <-> GuC-Err-Capture plumbing
@ 2025-01-28 18:36 Alan Previn
  2025-01-28 18:36 ` [PATCH v6 1/6] drm/xe/guc: Rename __guc_capture_parsed_output Alan Previn
                   ` (13 more replies)
  0 siblings, 14 replies; 28+ messages in thread
From: Alan Previn @ 2025-01-28 18:36 UTC (permalink / raw)
  To: intel-xe
  Cc: Alan Previn, dri-devel, Daniele Ceraolo Spurio, John Harrison,
	Matthew Brost, Zhanjun Dong, Rodrigo Vivi

GuC-Error-Capture currently reaches into the xe_devcoredump
structure to store its own placeholder snapshot pointer, working
around the race between the G2H-Error-Capture-Notification and the
Drm-Scheduler triggering GuC-Submission-exec-queue-timeout/kill.

From a subsystem layering perspective, this isn't scalable as
GuC should not be manipulating contents of a global structure it
does not own when responding to an unrelated thread / callstack.

Part of that workaround also has GuC-Error-Capture taking over
one of the front-end functions for xe_hw_engine_snapshot
generation, because an orthogonal debugfs caller requests raw
dumps of engine registers without a job. That request is better
handled by GuC-Error-Capture, since reading and printing engine
register lists takes a lot of management and we want to avoid
duplicating code or tables.

However, logically speaking, the GuC-Error-Capture output node
is really a subset of xe_hw_engine_snapshot. This holds regardless
of the fact that the majority of an engine snapshot consists of
register dumps that only GuC-Error-Capture can produce.

That said, this series refactors the plumbing between
GuC-Error-Capture and xe_devcoredump (including
xe_hw_engine_snapshot) to fix the layering for future
maintenance and scalability. This is done without changing
any functionality or IP-locality (i.e. GuC-Error-Capture still owns
the single point of engine register list definition and printing).
This series ensures 'xe_devcoredump_snapshot' owns
'xe_hw_engine_snapshot' generation and the latter owns
'xe_guc_capture_snapshot' retrieval (with GuC-Error-Capture
as its helper).
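
The ownership chain described above can be illustrated with a
minimal userspace sketch. This is not the driver code: every
sketch_* name, the struct fields and the register count are
hypothetical stand-ins chosen for illustration, and calloc/free
replace the kernel allocators. It only shows the intended direction
of ownership: the devcoredump snapshot asks the engine layer for an
engine snapshot, and the engine layer asks the capture helper for a
node, so the helper never writes into the devcoredump structure.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a parsed GuC-Error-Capture node. */
struct sketch_guc_capture_snapshot {
	unsigned int num_regs;
};

/* The engine snapshot owns the (optional) capture node. */
struct sketch_hw_engine_snapshot {
	const char *name;
	struct sketch_guc_capture_snapshot *capture; /* NULL if none found */
};

/* The devcoredump snapshot owns the engine snapshot. */
struct sketch_devcoredump_snapshot {
	struct sketch_hw_engine_snapshot *hwe;
};

/*
 * Helper role: hand a node to the caller and store nothing elsewhere.
 * 'have_node' is an assumption standing in for "a matching capture
 * was found"; 4 is an arbitrary illustrative register count.
 */
static struct sketch_guc_capture_snapshot *sketch_guc_capture_get(bool have_node)
{
	struct sketch_guc_capture_snapshot *c;

	if (!have_node)
		return NULL;
	c = calloc(1, sizeof(*c));
	if (c)
		c->num_regs = 4;
	return c;
}

/* Engine layer: creates its snapshot and retrieves the capture node. */
static struct sketch_hw_engine_snapshot *
sketch_hw_engine_snapshot_capture(const char *name, bool have_node)
{
	struct sketch_hw_engine_snapshot *s = calloc(1, sizeof(*s));

	if (!s)
		return NULL;
	s->name = name;
	s->capture = sketch_guc_capture_get(have_node);
	return s;
}

/* Freeing the engine snapshot also frees the node it owns. */
static void sketch_hw_engine_snapshot_free(struct sketch_hw_engine_snapshot *s)
{
	if (!s)
		return;
	free(s->capture);
	free(s);
}

/* Devcoredump layer: only talks to the engine layer. */
static bool sketch_devcoredump_capture(struct sketch_devcoredump_snapshot *ss,
				       const char *engine, bool have_node)
{
	ss->hwe = sketch_hw_engine_snapshot_capture(engine, have_node);
	return ss->hwe != NULL;
}

static void sketch_devcoredump_free(struct sketch_devcoredump_snapshot *ss)
{
	sketch_hw_engine_snapshot_free(ss->hwe);
	ss->hwe = NULL;
}
```

In this sketch a caller would invoke sketch_devcoredump_capture()
once per dump and sketch_devcoredump_free() when done; teardown
follows the same chain as creation, so no layer frees memory it does
not own.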

Alan Previn (6):
  drm/xe/guc: Rename __guc_capture_parsed_output
  drm/xe/guc: Don't store capture nodes in xe_devcoredump_snapshot
  drm/xe/guc: Split engine state print between xe_hw_engine vs
    xe_guc_capture
  drm/xe/guc: Move xe_hw_engine_snapshot creation back to xe_hw_engine.c
  drm/xe/xe_hw_engine: Update hw_engine_snapshot_capture for debugfs
  drm/xe/guc: Update comments on GuC-Err-Capture flows

 drivers/gpu/drm/xe/xe_devcoredump.c           |   3 -
 drivers/gpu/drm/xe/xe_devcoredump_types.h     |   6 -
 drivers/gpu/drm/xe/xe_guc_capture.c           | 365 ++++++++----------
 drivers/gpu/drm/xe/xe_guc_capture.h           |  16 +-
 .../drm/xe/xe_guc_capture_snapshot_types.h    |  53 +++
 drivers/gpu/drm/xe/xe_guc_submit.c            |  12 +-
 drivers/gpu/drm/xe/xe_hw_engine.c             | 111 ++++--
 drivers/gpu/drm/xe/xe_hw_engine.h             |   4 +-
 drivers/gpu/drm/xe/xe_hw_engine_types.h       |  13 +-
 9 files changed, 319 insertions(+), 264 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_guc_capture_snapshot_types.h


base-commit: 8b47c9cdb6a78364fe68f8af0abfd6f265577001
-- 
2.34.1


Thread overview: 28+ messages
2025-01-28 18:36 [PATCH v6 0/6] Maintenence of devcoredump <-> GuC-Err-Capture plumbing Alan Previn
2025-01-28 18:36 ` [PATCH v6 1/6] drm/xe/guc: Rename __guc_capture_parsed_output Alan Previn
2025-01-30 22:37   ` Rodrigo Vivi
2025-01-31 18:44     ` Teres Alexis, Alan Previn
2025-02-10 19:01   ` Dong, Zhanjun
2025-01-28 18:36 ` [PATCH v6 2/6] drm/xe/guc: Don't store capture nodes in xe_devcoredump_snapshot Alan Previn
2025-01-30 17:57   ` Teres Alexis, Alan Previn
2025-02-10 23:41   ` Dong, Zhanjun
2025-02-12 19:25     ` Teres Alexis, Alan Previn
2025-01-28 18:36 ` [PATCH v6 3/6] drm/xe/guc: Split engine state print between xe_hw_engine vs xe_guc_capture Alan Previn
2025-01-30 22:42   ` Rodrigo Vivi
2025-01-31 18:55     ` Teres Alexis, Alan Previn
2025-02-10 18:45       ` Teres Alexis, Alan Previn
2025-01-28 18:36 ` [PATCH v6 4/6] drm/xe/guc: Move xe_hw_engine_snapshot creation back to xe_hw_engine.c Alan Previn
2025-01-30 22:43   ` Rodrigo Vivi
2025-01-31 18:56     ` Teres Alexis, Alan Previn
2025-01-28 18:36 ` [PATCH v6 5/6] drm/xe/xe_hw_engine: Update hw_engine_snapshot_capture for debugfs Alan Previn
2025-01-28 20:45   ` kernel test robot
2025-01-28 18:36 ` [PATCH v6 6/6] drm/xe/guc: Update comments on GuC-Err-Capture flows Alan Previn
2025-01-28 21:19 ` ✓ CI.Patch_applied: success for Maintenence of devcoredump <-> GuC-Err-Capture plumbing Patchwork
2025-01-28 21:21 ` ✗ CI.checkpatch: warning " Patchwork
2025-01-28 21:22 ` ✓ CI.KUnit: success " Patchwork
2025-01-28 21:38 ` ✓ CI.Build: " Patchwork
2025-01-28 21:40 ` ✗ CI.Hooks: failure " Patchwork
2025-01-28 21:41 ` ✓ CI.checksparse: success " Patchwork
2025-01-28 22:01 ` ✓ Xe.CI.BAT: " Patchwork
2025-01-29 12:54 ` ✗ Xe.CI.Full: failure " Patchwork
2025-01-30 17:13   ` Teres Alexis, Alan Previn
