From: "Souza, Jose" <jose.souza@intel.com>
To: "De Marchi, Lucas" <lucas.demarchi@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
"Harrison, John C" <john.c.harrison@intel.com>,
"Vivi, Rodrigo" <rodrigo.vivi@intel.com>,
"stable@vger.kernel.org" <stable@vger.kernel.org>,
"Filipchuk, Julia" <julia.filipchuk@intel.com>
Subject: Re: [PATCH 1/2] drm/xe/devcoredump: Move exec queue snapshot to Contexts section
Date: Mon, 27 Jan 2025 20:27:46 +0000 [thread overview]
Message-ID: <7edb31ae5efc9d146feb446dc95e13e573f74dea.camel@intel.com> (raw)
In-Reply-To: <7c63b8da8dbf20b630f8dfb33858d7269441f4d8.camel@intel.com>
On Thu, 2025-01-23 at 09:04 -0800, José Roberto de Souza wrote:
> On Thu, 2025-01-23 at 10:45 -0600, Lucas De Marchi wrote:
> > On Thu, Jan 23, 2025 at 08:52:13AM -0600, Jose Souza wrote:
> > > On Thu, 2025-01-23 at 08:30 -0600, Lucas De Marchi wrote:
> > > > On Thu, Jan 23, 2025 at 08:14:11AM -0600, Jose Souza wrote:
> > > > > On Wed, 2025-01-22 at 21:11 -0800, Lucas De Marchi wrote:
> > > > > > Having the exec queue snapshot inside a "GuC CT" section was always
> > > > > > wrong. Commit c28fd6c358db ("drm/xe/devcoredump: Improve section
> > > > > > headings and add tile info") tried to fix that bug, but with that also
> > > > > > broke the mesa tool that parses the devcoredump, hence it was reverted
> > > > > > in commit 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa
> > > > > > debug tool").
> > > > > >
> > > > > > With the mesa tool also fixed, this can propagate as a fix on both
> > > > > > kernel and userspace side to avoid unnecessary headache for a debug
> > > > > > feature.
> > > > >
> > > > > This will break older versions of the Mesa parser. Is this really necessary?
> > > >
> > > > See cover letter with the mesa MR that would fix the tool to follow the
> > > > kernel fix and work with both newer and older format. Linking it here
> > > > anyway: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33177
> > >
> > > Still someone running the older version of the parser with a new Xe KMD would not be able to parse it.
> > > I understand that we can break it but is this really worthy? not in my opinion.
> >
> > because for the debug nature of this file, it's hard if we always keep
> > the cruft around. In 5 years developers implementing new decoders will
> > have to get data from random places because we will notice things are
> > wrongly placed.
> >
> > isn't it easier to do do it early so we don't increase the exposure
> > and just say "kernel screwed that up and fixed it", then propagate the
> > change in mesa in a stable release, just like we are doing in the
> > kernel?
> >
> > >
> > > >
> > > > It's a fix so simple that IMO it's better than carrying the cruft ad
> > > > infinitum on all the tools that may possibly parse the devcoredump.
> > > >
> > > >
> > > > > Is it worth breaking the tool? In my opinion, it is not.
> > > > >
> > > > > Also, do we need to discuss this now? Wouldn't it be better to focus on bringing the GuC log in first?
> > > >
> > > > That's what the second patch does. We need to discuss both now and
> > > > decide, otherwise we can't re-enable it and have either the guc log
> > > > parser or mesa's aubinator_error_decode_xe broken.
> > >
> > > I can't understand why it needs both, could you explain further?
> >
> > we are already discussing it, why not? Also as I said there's the guc
> > log parser, another tool, that is already expecting it in the other
> > place. So if we are going to re-enable the guc log, it's the best
> > opportunity to fix this, otherwise we will probably never do it and keep
> > accumulating.
>
> Much easier change a internal tool, no?
>
> The GuC log is in other section("**** GuC Log ****"), to me this change is more cosmetic than function and not worthy to break older versions of the
> parser.
>
> I can live with that but you will need to convince other Intel Mesa engineers, when I mentioned that Mesa parser was broken in a Mesa staff meeting, I
> was asked to push hard to get the Xe KMD patch reverted.
> That is why I think it should be taken care in another patch series as it could take more time...
After more discussion offline and on Mesa MR this is:
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
>
> >
> > Lucas De Marchi
> >
> > >
> > > >
> > > > Lucas De Marchi
> > > >
> > > > >
> > > > > >
> > > > > > Cc: John Harrison <John.C.Harrison@Intel.com>
> > > > > > Cc: Julia Filipchuk <julia.filipchuk@intel.com>
> > > > > > Cc: José Roberto de Souza <jose.souza@intel.com>
> > > > > > Cc: stable@vger.kernel.org
> > > > > > Fixes: 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa debug tool")
> > > > > > Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
> > > > > > ---
> > > > > > drivers/gpu/drm/xe/xe_devcoredump.c | 6 +-----
> > > > > > 1 file changed, 1 insertion(+), 5 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c
> > > > > > index 81dc7795c0651..a7946a76777e7 100644
> > > > > > --- a/drivers/gpu/drm/xe/xe_devcoredump.c
> > > > > > +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
> > > > > > @@ -119,11 +119,7 @@ static ssize_t __xe_devcoredump_read(char *buffer, size_t count,
> > > > > > drm_puts(&p, "\n**** GuC CT ****\n");
> > > > > > xe_guc_ct_snapshot_print(ss->guc.ct, &p);
> > > > > >
> > > > > > - /*
> > > > > > - * Don't add a new section header here because the mesa debug decoder
> > > > > > - * tool expects the context information to be in the 'GuC CT' section.
> > > > > > - */
> > > > > > - /* drm_puts(&p, "\n**** Contexts ****\n"); */
> > > > > > + drm_puts(&p, "\n**** Contexts ****\n");
> > > > > > xe_guc_exec_queue_snapshot_print(ss->ge, &p);
> > > > > >
> > > > > > drm_puts(&p, "\n**** Job ****\n");
> > > > >
> > >
>
next prev parent reply other threads:[~2025-01-27 20:27 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-23 5:11 [PATCH 0/2] Fix devcoredump and guc log dump Lucas De Marchi
2025-01-23 5:11 ` [PATCH 1/2] drm/xe/devcoredump: Move exec queue snapshot to Contexts section Lucas De Marchi
2025-01-23 14:14 ` Souza, Jose
2025-01-23 14:30 ` Lucas De Marchi
2025-01-23 14:52 ` Souza, Jose
2025-01-23 16:45 ` Lucas De Marchi
2025-01-23 17:04 ` Souza, Jose
2025-01-27 20:27 ` Souza, Jose [this message]
2025-01-23 5:11 ` [PATCH 2/2] drm/xe: Fix and re-enable xe_print_blob_ascii85() Lucas De Marchi
2025-01-23 17:51 ` Souza, Jose
2025-01-23 18:25 ` John Harrison
2025-01-23 18:36 ` Lucas De Marchi
2025-01-23 19:31 ` John Harrison
2025-01-23 5:25 ` ✓ CI.Patch_applied: success for Fix devcoredump and guc log dump Patchwork
2025-01-23 5:25 ` ✓ CI.checkpatch: " Patchwork
2025-01-23 5:26 ` ✓ CI.KUnit: " Patchwork
2025-01-23 5:48 ` ✓ CI.Build: " Patchwork
2025-01-23 5:51 ` ✗ CI.Hooks: failure " Patchwork
2025-01-23 5:53 ` ✓ CI.checksparse: success " Patchwork
2025-01-23 6:20 ` ✓ Xe.CI.BAT: " Patchwork
2025-01-23 15:53 ` ✗ Xe.CI.Full: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7edb31ae5efc9d146feb446dc95e13e573f74dea.camel@intel.com \
--to=jose.souza@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=john.c.harrison@intel.com \
--cc=julia.filipchuk@intel.com \
--cc=lucas.demarchi@intel.com \
--cc=rodrigo.vivi@intel.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox