All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 0/6] Maintenence of devcoredump <-> GuC-Err-Capture plumbing
@ 2025-01-28 18:36 Alan Previn
  2025-01-28 18:36 ` [PATCH v6 1/6] drm/xe/guc: Rename __guc_capture_parsed_output Alan Previn
                   ` (13 more replies)
  0 siblings, 14 replies; 28+ messages in thread
From: Alan Previn @ 2025-01-28 18:36 UTC (permalink / raw)
  To: intel-xe
  Cc: Alan Previn, dri-devel, Daniele Ceraolo Spurio, John Harrison,
	Matthew Brost, Zhanjun Dong, Rodrigo Vivi

The GuC-Error-Capture is currently reaching into xe_devcoredump
structure to store its own place-holder snaphot-ptr to workaround
the race between G2H-Error-Capture-Notification vs Drm-Scheduler
triggering GuC-Submission-exec-queue-timeout/kill.

From a subsystem layering perspective, this isn't scalable as
GuC should not be manipulating contents of a global structure it
does not own when responding to an unrelated thread / callstack.

Also, part of the earlier mentioned workaround includes the
GuC-Error-Capture taking on one of the front-end functions
for xe_hw_engine_snapshot generation because of an orthogonal
debugfs-caller requesting raw dumps of engine registers without
a job. This request is better handled by GuC-Error-Capture since
there is a lot to manage for reading and printing engine
register lists and we want to avoid duplicate code or tables.

However, logically speaking, the GuC-Error-Capture output node
is really a subset of xe_hw_engine_snapshot. This is irregardless
of the fact that the majority of an engine-snapshot is the
register dumps that only the GuC-Error-Capture can do.

That said, this series intends to refactor the plumbing between
Guc-Error-Capture and xe_devcoredump (including
xe_hw_engine_snapshot) to fix the layering for future
maintenence and scalability. This is done without changing
any functionality and IP-locality (i.e. GuC-Error-Capture still owns
the single point of engine register list definition and printing).
This series ensures 'xe_devcoredump_snapshot' owns
'xe_hw_engine_snapshot generation' and the latter owns
'xe_guc_capture_snapshot' retrieval (with GuC-Error-Capture
as its helper).

Alan Previn (6):
  drm/xe/guc: Rename __guc_capture_parsed_output
  drm/xe/guc: Don't store capture nodes in xe_devcoredump_snapshot
  drm/xe/guc: Split engine state print between xe_hw_engine vs
    xe_guc_capture
  drm/xe/guc: Move xe_hw_engine_snapshot creation back to xe_hw_engine.c
  drm/xe/xe_hw_engine: Update hw_engine_snapshot_capture for debugfs
  drm/xe/guc: Update comments on GuC-Err-Capture flows

 drivers/gpu/drm/xe/xe_devcoredump.c           |   3 -
 drivers/gpu/drm/xe/xe_devcoredump_types.h     |   6 -
 drivers/gpu/drm/xe/xe_guc_capture.c           | 365 ++++++++----------
 drivers/gpu/drm/xe/xe_guc_capture.h           |  16 +-
 .../drm/xe/xe_guc_capture_snapshot_types.h    |  53 +++
 drivers/gpu/drm/xe/xe_guc_submit.c            |  12 +-
 drivers/gpu/drm/xe/xe_hw_engine.c             | 111 ++++--
 drivers/gpu/drm/xe/xe_hw_engine.h             |   4 +-
 drivers/gpu/drm/xe/xe_hw_engine_types.h       |  13 +-
 9 files changed, 319 insertions(+), 264 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_guc_capture_snapshot_types.h


base-commit: 8b47c9cdb6a78364fe68f8af0abfd6f265577001
-- 
2.34.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2025-02-12 19:25 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-28 18:36 [PATCH v6 0/6] Maintenence of devcoredump <-> GuC-Err-Capture plumbing Alan Previn
2025-01-28 18:36 ` [PATCH v6 1/6] drm/xe/guc: Rename __guc_capture_parsed_output Alan Previn
2025-01-30 22:37   ` Rodrigo Vivi
2025-01-31 18:44     ` Teres Alexis, Alan Previn
2025-02-10 19:01   ` Dong, Zhanjun
2025-01-28 18:36 ` [PATCH v6 2/6] drm/xe/guc: Don't store capture nodes in xe_devcoredump_snapshot Alan Previn
2025-01-30 17:57   ` Teres Alexis, Alan Previn
2025-02-10 23:41   ` Dong, Zhanjun
2025-02-12 19:25     ` Teres Alexis, Alan Previn
2025-01-28 18:36 ` [PATCH v6 3/6] drm/xe/guc: Split engine state print between xe_hw_engine vs xe_guc_capture Alan Previn
2025-01-30 22:42   ` Rodrigo Vivi
2025-01-31 18:55     ` Teres Alexis, Alan Previn
2025-02-10 18:45       ` Teres Alexis, Alan Previn
2025-01-28 18:36 ` [PATCH v6 4/6] drm/xe/guc: Move xe_hw_engine_snapshot creation back to xe_hw_engine.c Alan Previn
2025-01-30 22:43   ` Rodrigo Vivi
2025-01-31 18:56     ` Teres Alexis, Alan Previn
2025-01-28 18:36 ` [PATCH v6 5/6] drm/xe/xe_hw_engine: Update hw_engine_snapshot_capture for debugfs Alan Previn
2025-01-28 20:45   ` kernel test robot
2025-01-28 18:36 ` [PATCH v6 6/6] drm/xe/guc: Update comments on GuC-Err-Capture flows Alan Previn
2025-01-28 21:19 ` ✓ CI.Patch_applied: success for Maintenence of devcoredump <-> GuC-Err-Capture plumbing Patchwork
2025-01-28 21:21 ` ✗ CI.checkpatch: warning " Patchwork
2025-01-28 21:22 ` ✓ CI.KUnit: success " Patchwork
2025-01-28 21:38 ` ✓ CI.Build: " Patchwork
2025-01-28 21:40 ` ✗ CI.Hooks: failure " Patchwork
2025-01-28 21:41 ` ✓ CI.checksparse: success " Patchwork
2025-01-28 22:01 ` ✓ Xe.CI.BAT: " Patchwork
2025-01-29 12:54 ` ✗ Xe.CI.Full: failure " Patchwork
2025-01-30 17:13   ` Teres Alexis, Alan Previn

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.