From: "Souza, Jose" <jose.souza@intel.com>
To: "Intel-Xe@Lists.FreeDesktop.Org" <Intel-Xe@Lists.FreeDesktop.Org>,
"Harrison, John C" <john.c.harrison@intel.com>,
"Vivi, Rodrigo" <rodrigo.vivi@intel.com>,
"De Marchi, Lucas" <lucas.demarchi@intel.com>
Cc: "Filipchuk, Julia" <julia.filipchuk@intel.com>
Subject: Re: [PATCH v9 03/11] drm/xe/devcoredump: Improve section headings and add tile info
Date: Thu, 12 Dec 2024 20:30:59 +0000
Message-ID: <2e25469bbaab8d1ee3d70b1bbbf295faa6220dd8.camel@intel.com>
In-Reply-To: <ea46ec11-b7a5-4103-8fdb-493e2489a688@intel.com>
On Thu, 2024-12-12 at 12:06 -0800, John Harrison wrote:
> On 12/12/2024 11:31, Souza, Jose wrote:
> > On Thu, 2024-12-12 at 10:59 -0800, John Harrison wrote:
> > > On 12/12/2024 10:17, Souza, Jose wrote:
> > > > On Wed, 2024-10-02 at 17:46 -0700, John.C.Harrison@Intel.com wrote:
> > > > > From: John Harrison <John.C.Harrison@Intel.com>
> > > > >
> > > > > The xe_guc_exec_queue_snapshot is not really a GuC internal thing and
> > > > > is definitely not a GuC CT thing. So give it its own section heading.
> > > > > The snapshot itself is really a capture of the submission backend's
> > > > > internal state. Although all it currently prints out is the submission
> > > > > contexts. So label it as 'Contexts'. If more general state is added
> > > > > later then it could be changed to 'Submission backend' or some such.
> > > > >
> > > > > Further, everything from the GuC CT section onwards is GT specific but
> > > > > there was no indication of which GT it was related to (and that is
> > > > > impossible to work out from the other fields that are given). So add a
> > > > > GT section heading. Also include the tile id of the GT, because again
> > > > > that is significant information.
> > > > >
> > > > > Lastly, drop a couple of unnecessary line feeds within sections.
> > > > >
> > > > > v2: Add GT section heading, add tile id to device section.
> > > > >
> > > > > Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> > > > > Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
> > > > > ---
> > > > > drivers/gpu/drm/xe/xe_devcoredump.c | 5 +++++
> > > > > drivers/gpu/drm/xe/xe_devcoredump_types.h | 3 ++-
> > > > > drivers/gpu/drm/xe/xe_device.c | 1 +
> > > > > drivers/gpu/drm/xe/xe_guc_submit.c | 2 +-
> > > > > drivers/gpu/drm/xe/xe_hw_engine.c | 1 -
> > > > > 5 files changed, 9 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
> > > > > index d23719d5c2a3..2690f1d1cde4 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_devcoredump.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
> > > > > @@ -96,8 +96,13 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
> > > > > drm_printf(&p, "Process: %s\n", ss->process_name);
> > > > > xe_device_snapshot_print(xe, &p);
> > > > >
> > > > > + drm_printf(&p, "\n**** GT #%d ****\n", ss->gt->info.id);
> > > > > + drm_printf(&p, "\tTile: %d\n", ss->gt->tile->id);
> > > > > +
> > > > > drm_puts(&p, "\n**** GuC CT ****\n");
> > > > > xe_guc_ct_snapshot_print(ss->ct, &p);
> > > > > +
> > > > > + drm_puts(&p, "\n**** Contexts ****\n");
> > > > > xe_guc_exec_queue_snapshot_print(ss->ge, &p);
> > > > This broke the Mesa parser!
> > > > It can no longer parse the exec_queue context because it expected that to be in the '**** GuC CT ****' section.
> > > Then the Mesa parser needs to be updated. That was clearly a bug - exec
> > > queue contexts are absolutely not GuC CT data and should not be in the
> > > GuC CT section.
> > It doesn't matter whether it is a bug or not; it broke the parser.
> > If this is not reverted we will have older kernel versions that don't work with newer Mesa and newer kernel versions that don't work with older Mesa.
> Debug tools cannot count as UAPI that must never change.
That is not my understanding from previous threads.
Imagine that a big customer files a bug with us and attaches the devcoredump from an older kernel version.
The devcoredump parser will not work. If the developer is aware of this "contract" break, he can check out an older UMD version, build it, and then
parse the devcoredump, then check out the main/master branch again and work on the fix... Not viable at all.
At the very least, UMD teams should be notified. At the moment Mesa debugging is blocked because of these patches.
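For illustration, here is a minimal sketch of how a devcoredump reader could accept both layouts, so that dumps from older and newer kernels parse with the same tool. This is not Mesa's actual parser: the enum and function names are hypothetical, and it relies only on the section headings and the "GuC ID:" line shown in the patch above.

```c
#include <string.h>

/* Hypothetical section identifiers; not taken from Mesa or the kernel. */
enum dump_section {
	SECTION_UNKNOWN,	/* not a heading / nothing seen yet */
	SECTION_GUC_CT,
	SECTION_CONTEXTS,
	SECTION_OTHER,		/* any other "**** ... ****" heading */
};

/* Map a "**** Name ****" heading line to a section id. */
static enum dump_section parse_heading(const char *line)
{
	if (strncmp(line, "**** ", 5) != 0)
		return SECTION_UNKNOWN;
	if (strncmp(line + 5, "GuC CT ****", 11) == 0)
		return SECTION_GUC_CT;
	if (strncmp(line + 5, "Contexts ****", 13) == 0)
		return SECTION_CONTEXTS;
	return SECTION_OTHER;
}

/*
 * Count exec queue entries ("GuC ID: N" lines), accepting them under
 * either the old "GuC CT" heading or the new "Contexts" heading.
 */
static int count_exec_queues(const char *dump)
{
	enum dump_section cur = SECTION_UNKNOWN;
	int count = 0;

	while (*dump) {
		const char *eol = strchr(dump, '\n');
		size_t len = eol ? (size_t)(eol - dump) : strlen(dump);
		enum dump_section s = parse_heading(dump);

		if (s != SECTION_UNKNOWN)
			cur = s;	/* entered a new section */
		else if ((cur == SECTION_GUC_CT || cur == SECTION_CONTEXTS) &&
			 len >= 8 && strncmp(dump, "GuC ID: ", 8) == 0)
			count++;	/* exec queue entry in either layout */

		dump += len;
		if (*dump == '\n')
			dump++;		/* skip the line terminator */
	}
	return count;
}
```

A real parser would extract the fields of each entry rather than count them, but the same idea applies: key on the entry line itself and treat the enclosing heading as a hint rather than a requirement.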
>
> The devcoredump contains much information that is essentially the
> internals of the kernel. It is going to change. That is about the only
> guarantee that we can make about it. And saying that we must
> intentionally break the output of a developer only debug feature in
> order to support older mesa is plain wrong. End users do not care about
> debug tools. All user applications will still work just perfectly.
>
> We can start adding version numbers to the devcoredump format if we
> really need to. But that was already shot down as a bad idea. It is
> debug information and not UAPI. So version incompatibilities are
> expected from time to time.
>
> John.
>
>
> >
> > > John.
> > >
> > > > >
> > > > > drm_puts(&p, "\n**** Job ****\n");
> > > > > diff --git a/drivers/gpu/drm/xe/xe_devcoredump_types.h b/drivers/gpu/drm/xe/xe_devcoredump_types.h
> > > > > index 440d05d77a5a..3cc2f095fdfb 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_devcoredump_types.h
> > > > > +++ b/drivers/gpu/drm/xe/xe_devcoredump_types.h
> > > > > @@ -37,7 +37,8 @@ struct xe_devcoredump_snapshot {
> > > > > /* GuC snapshots */
> > > > > /** @ct: GuC CT snapshot */
> > > > > struct xe_guc_ct_snapshot *ct;
> > > > > - /** @ge: Guc Engine snapshot */
> > > > > +
> > > > > + /** @ge: GuC Submission Engine snapshot */
> > > > > struct xe_guc_submit_exec_queue_snapshot *ge;
> > > > >
> > > > > /** @hwe: HW Engine snapshot array */
> > > > > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > > > > index 09a7ad830e69..030cf703e970 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_device.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_device.c
> > > > > @@ -961,6 +961,7 @@ void xe_device_snapshot_print(struct xe_device *xe, struct drm_printer *p)
> > > > >
> > > > > for_each_gt(gt, xe, id) {
> > > > > drm_printf(p, "GT id: %u\n", id);
> > > > > + drm_printf(p, "\tTile: %u\n", gt->tile->id);
> > > > > drm_printf(p, "\tType: %s\n",
> > > > > gt->info.type == XE_GT_TYPE_MAIN ? "main" : "media");
> > > > > drm_printf(p, "\tIP ver: %u.%u.%u\n",
> > > > > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > > index 0ac4a19ec9cc..8690df699170 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > > @@ -2240,7 +2240,7 @@ xe_guc_exec_queue_snapshot_print(struct xe_guc_submit_exec_queue_snapshot *snaps
> > > > > if (!snapshot)
> > > > > return;
> > > > >
> > > > > - drm_printf(p, "\nGuC ID: %d\n", snapshot->guc.id);
> > > > > + drm_printf(p, "GuC ID: %d\n", snapshot->guc.id);
> > > > > drm_printf(p, "\tName: %s\n", snapshot->name);
> > > > > drm_printf(p, "\tClass: %d\n", snapshot->class);
> > > > > drm_printf(p, "\tLogical mask: 0x%x\n", snapshot->logical_mask);
> > > > > diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c b/drivers/gpu/drm/xe/xe_hw_engine.c
> > > > > index ea6d9ef7fab6..6c9c27304cdc 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_hw_engine.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_hw_engine.c
> > > > > @@ -1084,7 +1084,6 @@ void xe_hw_engine_snapshot_print(struct xe_hw_engine_snapshot *snapshot,
> > > > > if (snapshot->hwe->class == XE_ENGINE_CLASS_COMPUTE)
> > > > > drm_printf(p, "\tRCU_MODE: 0x%08x\n",
> > > > > snapshot->reg.rcu_mode);
> > > > > - drm_puts(p, "\n");
> > > > > }
> > > > >
> > > > > /**
>