From: "Summers, Stuart" <stuart.summers@intel.com>
To: "Vivi, Rodrigo" <rodrigo.vivi@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>
Subject: Re: [Intel-xe] [PATCH 1/2] drm/xe: Print devcoredump to drm_info
Date: Thu, 5 Oct 2023 16:26:03 +0000 [thread overview]
Message-ID: <9d192e8a023dcfaa6487f82c0bfd4baf0345f746.camel@intel.com> (raw)
In-Reply-To: <ZR7gpm90CTkEQA+P@intel.com>
On Thu, 2023-10-05 at 12:13 -0400, Rodrigo Vivi wrote:
> On Thu, Oct 05, 2023 at 11:38:10AM -0400, Summers, Stuart wrote:
> > On Wed, 2023-10-04 at 09:37 -0400, Rodrigo Vivi wrote:
> > > On Tue, Sep 26, 2023 at 09:20:55PM +0000, Stuart Summers wrote:
> > > > Currently the devcoredump is available in the file system
> > > > for a user. However 1) CI isn't currently set up to dump
> > > > this data and 2) if we try to dump this during the driver
> > > > load when a failure is observed, the driver will be torn
> > > > down before we have a chance to actually read this data.
> > > >
> > > > Add a quick routine to print this out with drm_info as well
> > > > if CONFIG_DRM_XE_DEBUG is enabled.
> > >
> > > I feel this is kind of a duplication of simple_error_capture()
> > > that I thought we could kill... at least that has a more specific
> >
> > What was the reason to kill it?
>
> just ignore that thought. I thought it was not useful after we
> got the devcoredump, but you just showed a good case to keep.
>
> >
> > > config: CONFIG_DRM_XE_SIMPLE_ERROR_CAPTURE
> > >
> > > I mean, maybe the right thing to do is to kill that and have
> > > this one here, but with a dedicated config?!...
> >
> > I feel like the ability to dump extra information when an error
> > happens
> > during driver load is critical for debug. We could come up with a
> > way
> > to dump this during teardown to some other file that is persistent
> > -
> > even in /tmp or something?
>
> what I mean now is to just keep one version of it, or keep that
> simple error capture, or the one that you are proposing, but not
> both.
> and with a dedicated config like this
> CONFIG_DRM_XE_SIMPLE_ERROR_CAPTURE
> instead of the reuse of the generic debug config like you had here.
Ok makes a lot of sense now, thanks Rodrigo. I'll clean it up and
resend after some related debug I'm doing.
Thanks,
Stuart
>
> >
> > Thanks,
> > Stuart
> >
> > >
> > > >
> > > > Signed-off-by: Stuart Summers <stuart.summers@intel.com>
> > > > ---
> > > > drivers/gpu/drm/xe/xe_devcoredump.c | 17 ++++++++++++++++-
> > > > drivers/gpu/drm/xe/xe_devcoredump.h | 3 +++
> > > > 2 files changed, 19 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c
> > > > b/drivers/gpu/drm/xe/xe_devcoredump.c
> > > > index 68abc0b195be..aa41d8e9b602 100644
> > > > --- a/drivers/gpu/drm/xe/xe_devcoredump.c
> > > > +++ b/drivers/gpu/drm/xe/xe_devcoredump.c
> > > > @@ -78,7 +78,11 @@ static ssize_t xe_devcoredump_read(char
> > > > *buffer,
> > > > loff_t offset,
> > > > iter.remain = count;
> > > >
> > > > ss = &coredump->snapshot;
> > > > - p = drm_coredump_printer(&iter);
> > > > +
> > > > + if (iter.data)
> > > > + p = drm_coredump_printer(&iter);
> > > > + else
> > > > + p = drm_info_printer(coredump_to_xe(coredump)-
> > > > > drm.dev);
> > > >
> > > > drm_printf(&p, "**** Xe Device Coredump ****\n");
> > > > drm_printf(&p, "kernel: " UTS_RELEASE "\n");
> > > > @@ -102,6 +106,15 @@ static ssize_t xe_devcoredump_read(char
> > > > *buffer, loff_t offset,
> > > > return count - iter.remain;
> > > > }
> > > >
> > > > +/* Print the coredump locally also for debug purposes */
> > > > +void
> > > > +xe_devcoredump_print(struct xe_devcoredump *coredump)
> > > > +{
> > > > +#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
> > > > + xe_devcoredump_read(NULL, 0, 0, coredump, 0);
> > > > +#endif
> > > > +}
> > > > +
> > > > static void xe_devcoredump_free(void *data)
> > > > {
> > > > struct xe_devcoredump *coredump = data;
> > > > @@ -192,5 +205,7 @@ void xe_devcoredump(struct xe_exec_queue
> > > > *q)
> > > >
> > > > dev_coredumpm(xe->drm.dev, THIS_MODULE, coredump, 0,
> > > > GFP_KERNEL,
> > > > xe_devcoredump_read,
> > > > xe_devcoredump_free);
> > > > +
> > > > + xe_devcoredump_print(coredump);
> > > > }
> > > > #endif
> > > > diff --git a/drivers/gpu/drm/xe/xe_devcoredump.h
> > > > b/drivers/gpu/drm/xe/xe_devcoredump.h
> > > > index 6ac218a5c194..ba0c2a7b71b4 100644
> > > > --- a/drivers/gpu/drm/xe/xe_devcoredump.h
> > > > +++ b/drivers/gpu/drm/xe/xe_devcoredump.h
> > > > @@ -8,6 +8,9 @@
> > > >
> > > > struct xe_device;
> > > > struct xe_exec_queue;
> > > > +struct xe_devcoredump;
> > > > +
> > > > +void xe_devcoredump_print(struct xe_devcoredump *coredump);
> > > >
> > > > #ifdef CONFIG_DEV_COREDUMP
> > > > void xe_devcoredump(struct xe_exec_queue *q);
> > > > --
> > > > 2.34.1
> > > >
> >
prev parent reply other threads:[~2023-10-05 16:26 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-26 21:20 [Intel-xe] [PATCH 1/2] drm/xe: Print devcoredump to drm_info Stuart Summers
2023-09-26 21:20 ` [Intel-xe] [PATCH 2/2] drm/xe: Add coredump to wa_bb timeouts Stuart Summers
2023-10-04 13:40 ` Rodrigo Vivi
2023-10-05 15:39 ` Summers, Stuart
2023-09-26 21:23 ` [Intel-xe] ✓ CI.Patch_applied: success for series starting with [1/2] drm/xe: Print devcoredump to drm_info Patchwork
2023-09-26 21:23 ` [Intel-xe] ✓ CI.checkpatch: " Patchwork
2023-09-26 21:24 ` [Intel-xe] ✓ CI.KUnit: " Patchwork
2023-09-26 21:31 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-09-26 21:32 ` [Intel-xe] ✓ CI.Hooks: " Patchwork
2023-09-26 21:33 ` [Intel-xe] ✓ CI.checksparse: " Patchwork
2023-09-26 22:05 ` [Intel-xe] ✓ CI.BAT: " Patchwork
2023-10-04 13:37 ` [Intel-xe] [PATCH 1/2] " Rodrigo Vivi
2023-10-05 15:38 ` Summers, Stuart
2023-10-05 16:13 ` Rodrigo Vivi
2023-10-05 16:26 ` Summers, Stuart [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9d192e8a023dcfaa6487f82c0bfd4baf0345f746.camel@intel.com \
--to=stuart.summers@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=rodrigo.vivi@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox