All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gustavo Sousa <gustavo.sousa@intel.com>
To: "Summers, Stuart" <stuart.summers@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
	"Lin, Shuicheng" <shuicheng.lin@intel.com>,
	"Vivi, Rodrigo" <rodrigo.vivi@intel.com>,
	"Roper, Matthew D" <matthew.d.roper@intel.com>,
	"Brost, Matthew" <matthew.brost@intel.com>,
	"Wajdeczko, Michal" <michal.wajdeczko@intel.com>,
	"Nerlige Ramappa, Umesh" <umesh.nerlige.ramappa@intel.com>,
	"Ceraolo Spurio, Daniele" <daniele.ceraolospurio@intel.com>
Subject: Re: [PATCH 4/9] drm/xe: Add a new debug focused configfs group
Date: Mon, 4 May 2026 16:04:11 -0300	[thread overview]
Message-ID: <87zf2fc7ys.fsf@intel.com> (raw)
In-Reply-To: <dede21f05cac6e683c4f001a8cc31b883ba2810c.camel@intel.com>

"Summers, Stuart" <stuart.summers@intel.com> writes:

> On Mon, 2026-05-04 at 14:28 -0300, Gustavo Sousa wrote:
>> "Summers, Stuart" <stuart.summers@intel.com> writes:
>> 
>> > On Mon, 2026-05-04 at 12:42 -0300, Gustavo Sousa wrote:
>> > > Stuart Summers <stuart.summers@intel.com> writes:
>> > > 
>> > > > Add the skeleton code for a new debug specific configfs group.
>> > > > 
>> > > > Signed-off-by: Stuart Summers <stuart.summers@intel.com>
>> > > > Assisted-by: Copilot:claude-opus-4.7
>> > > > ---
>> > > >  drivers/gpu/drm/xe/Makefile            |  2 ++
>> > > >  drivers/gpu/drm/xe/xe_configfs.c       |  4 ++++
>> > > >  drivers/gpu/drm/xe/xe_configfs_debug.c | 14 ++++++++++++++
>> > > >  drivers/gpu/drm/xe/xe_configfs_debug.h |  8 ++++++++
>> > > >  drivers/gpu/drm/xe/xe_configfs_types.h |  7 +++++++
>> > > >  5 files changed, 35 insertions(+)
>> > > >  create mode 100644 drivers/gpu/drm/xe/xe_configfs_debug.c
>> > > >  create mode 100644 drivers/gpu/drm/xe/xe_configfs_debug.h
>> > > > 
>> > > > diff --git a/drivers/gpu/drm/xe/Makefile
>> > > > b/drivers/gpu/drm/xe/Makefile
>> > > > index 09661f079d03..b58667b0b18e 100644
>> > > > --- a/drivers/gpu/drm/xe/Makefile
>> > > > +++ b/drivers/gpu/drm/xe/Makefile
>> > > > @@ -161,6 +161,8 @@ xe-$(CONFIG_HWMON) += xe_hwmon.o
>> > > >  
>> > > >  xe-$(CONFIG_PERF_EVENTS) += xe_pmu.o
>> > > >  xe-$(CONFIG_CONFIGFS_FS) += xe_configfs.o
>> > > > +xe_debug_configfs_obj-$(CONFIG_DRM_XE_DEBUG) :=
>> > > > xe_configfs_debug.o
>> > > > +xe-$(CONFIG_CONFIGFS_FS) += $(xe_debug_configfs_obj-y)
>> > > 
>> > > What are the downsides of keeping this available for non-debug
>> > > builds?
>> > > 
>> > > I imagine that the debug configfs could be useful for someone not
>> > > involved in kernel development working together with the
>> > > developers
>> > > to
>> > > debug issues without needing to rebuild their kernel.
>> > 
>> > So the focal parameter we had in i915 was enable_rc6. Something
>> > like
>> > this is what we explicitly don't want to expose to a general user
>> > as it
>> > can have unforeseen consequences (like having very high power
>> > usage).
>> > But, having the ability to disable rc6 for debug purposes as a
>> > developer is extremely useful at narrowing down certain types of
>> > bugs
>> > at the hardware/software interface.
>> 
>> We could probably keep the debug stuff available but cause a taint if
>> they ever get used?
>
> So the tainting is really just for our CI I think. This won't
> necessarily impact end users - at least on the client side - who might
> just ignore it (or not even notice it).

I *think* tainting is not only for CI (someone could correct me here).
I see that our driver adds a taint when the user is applying some
parameters (module parameters or configfs) that alters the "officially
supported" behavior of the driver.

Some examples I could find are usages of module_param_named_unsafe() in
the module param front, and setup_configfs_post_ctx_restore_bb(),
setup_configfs_mid_ctx_restore_bb() in configfs.

>
>> 
>> The scenario I had in mind is someone who is not familiar with
>> building
>> their own kernel working together with the development community on a
>> hard-to-reproduce issue and some of the debug configfs attributes
>> being
>> useful to gather more relevant data.
>
> Yeah but IMO we really should be able to instruct such a user how to do
> that - the steps there aren't too difficult if they're already
> installing kernels etc.
>
> A lot of our debug hooks are already held under various Kconfig options
> anyway like DRM_XE_DEBUG, DRM_XE_DEBUG_GUC, DRM_XE_DEBUG_VM, etc. And a
> lot of the information under those Kconfigs are also critical for
> different debugs.

Are those due to performance or resource usage implications if they are
kept enabled?  If so, is this the same case here?

>
> I'd really prefer to keep these under DRM_XE_DEBUG (or some other
> Kconfig) if possible so we can maximize the ease of adding new
> parameters as specific debugs come up.

But why would keeping them under a build flag make it easier to add new
parameters as opposed to always keeping the knobs available for use?

--
Gustavo Sousa

>
> Thanks,
> Stuart
>
>> 
>> --
>> Gustavo Sousa
>> 
>> > 
>> > That's just one example, but there are a bunch of things that could
>> > fit
>> > this category and I expect we'll have even more use cases over
>> > time. 
>> > 
>> > Right now we have to hold these types of changes in a side branch
>> > not
>> > exposed externally and I really want to converge our processes so
>> > we
>> > can more easily make these kinds of changes and let us debug
>> > directly
>> > out of drm-tip.
>> > 
>> > > 
>> > > >  
>> > > >  # graphics virtualization (SR-IOV) support
>> > > >  xe-y += \
>> > > > diff --git a/drivers/gpu/drm/xe/xe_configfs.c
>> > > > b/drivers/gpu/drm/xe/xe_configfs.c
>> > > > index 12b7fe65446d..85df8ce5cf2a 100644
>> > > > --- a/drivers/gpu/drm/xe/xe_configfs.c
>> > > > +++ b/drivers/gpu/drm/xe/xe_configfs.c
>> > > > @@ -1006,6 +1006,10 @@ static struct config_group
>> > > > *xe_config_make_device_group(struct config_group *gro
>> > > >                 config_group_init_type_name(&dev->sriov,
>> > > > "sriov",
>> > > > &xe_config_sriov_type);
>> > > >                 configfs_add_default_group(&dev->sriov, &dev-
>> > > > > group);
>> > > >         }
>> > > > +#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
>> > > > +       config_group_init_type_name(&dev->debug, "debug",
>> > > > &xe_configfs_debug_type);
>> > > > +       configfs_add_default_group(&dev->debug, &dev->group);
>> > > > +#endif
>> > > 
>> > > I guess there isn't much we can do about it today, but, based on
>> > > what
>> > > I
>> > > read in configfs documentation a while ago, I suspect it wasn't
>> > > really
>> > > designed with arbitrary hierarchies of config items in mind.  It
>> > > would
>> > > be nice if there was an API for that though.
>> > > 
>> > > Basically, a config group is expected to be a directory to
>> > > contain a
>> > > set
>> > > of config items of the same type, which is not the case on our
>> > > current
>> > > usage (i.e. the "sriov" and now the "debug" group).
>> > > 
>> > > I guess we are able to get our desired behavior (basically just a
>> > > subdirectory behaving as a config item) by not implementing the
>> > > "make_item" hook.  Not sure if that's an expected use of the API
>> > > or
>> > > if
>> > > that's more of a hack on our side and could come back to bite us
>> > > in
>> > > the
>> > > future.
>> > 
>> > Hm.. ok that's good feedback. I do like bucketing these in groups
>> > and
>> > subdirectories. It also makes it easy to show or not show based on
>> > the
>> > Kconfig. But let me go through and make sure there isn't something
>> > more
>> > suited here. Happy to hear any other feedback as well here.
>> > 
>> > Thanks,
>> > Stuart
>> > 
>> > > 
>> > > --
>> > > Gustavo Sousa
>> > > 
>> > > >  
>> > > >         mutex_init(&dev->lock);
>> > > >  
>> > > > diff --git a/drivers/gpu/drm/xe/xe_configfs_debug.c
>> > > > b/drivers/gpu/drm/xe/xe_configfs_debug.c
>> > > > new file mode 100644
>> > > > index 000000000000..45617282cec5
>> > > > --- /dev/null
>> > > > +++ b/drivers/gpu/drm/xe/xe_configfs_debug.c
>> > > > @@ -0,0 +1,14 @@
>> > > > +// SPDX-License-Identifier: MIT
>> > > > +/*
>> > > > + * Copyright © 2026 Intel Corporation
>> > > > + */
>> > > > +
>> > > > +#include <linux/configfs.h>
>> > > > +#include <linux/module.h>
>> > > > +
>> > > > +#include "xe_configfs_debug.h"
>> > > > +#include "xe_configfs_types.h"
>> > > > +
>> > > > +const struct config_item_type xe_configfs_debug_type = {
>> > > > +       .ct_owner       = THIS_MODULE,
>> > > > +};
>> > > > diff --git a/drivers/gpu/drm/xe/xe_configfs_debug.h
>> > > > b/drivers/gpu/drm/xe/xe_configfs_debug.h
>> > > > new file mode 100644
>> > > > index 000000000000..01170dc2f97e
>> > > > --- /dev/null
>> > > > +++ b/drivers/gpu/drm/xe/xe_configfs_debug.h
>> > > > @@ -0,0 +1,8 @@
>> > > > +/* SPDX-License-Identifier: MIT */
>> > > > +/*
>> > > > + * Copyright © 2026 Intel Corporation
>> > > > + */
>> > > > +#ifndef _XE_CONFIGFS_DEBUG_H_
>> > > > +#define _XE_CONFIGFS_DEBUG_H_
>> > > > +
>> > > > +#endif /* _XE_CONFIGFS_DEBUG_H_ */
>> > > > diff --git a/drivers/gpu/drm/xe/xe_configfs_types.h
>> > > > b/drivers/gpu/drm/xe/xe_configfs_types.h
>> > > > index 935097aafa96..c9d94a3c26a7 100644
>> > > > --- a/drivers/gpu/drm/xe/xe_configfs_types.h
>> > > > +++ b/drivers/gpu/drm/xe/xe_configfs_types.h
>> > > > @@ -24,6 +24,9 @@ struct wa_bb {
>> > > >  struct xe_config_group_device {
>> > > >         struct config_group group;
>> > > >         struct config_group sriov;
>> > > > +#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
>> > > > +       struct config_group debug;
>> > > > +#endif
>> > > >  
>> > > >         struct xe_config_device {
>> > > >                 struct wa_bb
>> > > > ctx_restore_mid_bb[XE_ENGINE_CLASS_MAX];
>> > > > @@ -56,4 +59,8 @@ static inline struct xe_config_device
>> > > > *xe_configfs_to_device(struct config_item
>> > > >         return &xe_configfs_to_group_device(item)->config;
>> > > >  }
>> > > >  
>> > > > +#if IS_ENABLED(CONFIG_DRM_XE_DEBUG)
>> > > > +extern const struct config_item_type xe_configfs_debug_type;
>> > > > +#endif
>> > > > +
>> > > >  #endif /* _XE_CONFIGFS_TYPES_H_ */
>> > > > -- 
>> > > > 2.43.0

  reply	other threads:[~2026-05-04 19:04 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-04  4:43 [PATCH 0/9] Add new debug infrastructure for configfs Stuart Summers
2026-05-04  4:43 ` [PATCH 1/9] drm/xe: Rename survivability_mode to enable_survivability_mode Stuart Summers
2026-05-04 13:29   ` Gustavo Sousa
2026-05-04 14:32     ` Summers, Stuart
2026-05-04 14:38       ` Summers, Stuart
2026-05-04 14:40     ` Summers, Stuart
2026-05-04 18:31     ` Rodrigo Vivi
2026-05-04 18:38       ` Summers, Stuart
2026-05-04  4:43 ` [PATCH 2/9] drm/xe: Sort xe_config_device fields and defaults alphabetically Stuart Summers
2026-05-04 13:58   ` Gustavo Sousa
2026-05-04 14:38     ` Summers, Stuart
2026-05-04 15:47   ` Lin, Shuicheng
2026-05-04 15:54     ` Summers, Stuart
2026-05-04  4:43 ` [PATCH 3/9] drm/xe: Split out configfs data structures Stuart Summers
2026-05-04  4:52   ` Summers, Stuart
2026-05-04  8:47   ` Jani Nikula
2026-05-04 14:24     ` Summers, Stuart
2026-05-04 21:48       ` Matthew Brost
2026-05-04 21:51         ` Summers, Stuart
2026-05-04  4:43 ` [PATCH 4/9] drm/xe: Add a new debug focused configfs group Stuart Summers
2026-05-04 15:42   ` Gustavo Sousa
2026-05-04 15:50     ` Summers, Stuart
2026-05-04 17:28       ` Gustavo Sousa
2026-05-04 17:44         ` Summers, Stuart
2026-05-04 19:04           ` Gustavo Sousa [this message]
2026-05-07 22:58             ` Matt Roper
2026-05-07 23:02               ` Summers, Stuart
2026-05-08 17:29                 ` Matt Roper
2026-05-08 17:55                   ` Summers, Stuart
2026-05-05 21:45       ` Summers, Stuart
2026-05-04  4:43 ` [PATCH 5/9] drm/xe: Move debug configfs entries to xe_configfs_debug.c Stuart Summers
2026-05-04  4:43 ` [PATCH 6/9] drm/xe/guc: Add configfs support for guc_log_level Stuart Summers
2026-05-05 23:54   ` Daniele Ceraolo Spurio
2026-05-04  4:43 ` [PATCH 7/9] drm/xe/guc: Add support for NPK as a GuC log target Stuart Summers
2026-05-04  4:43 ` [PATCH 8/9] drm/xe: Add infrastructure for debug configfs parameters Stuart Summers
2026-05-04  4:43 ` [PATCH 9/9] drm/xe: Migrate existing debug configfs entries to params infrastructure Stuart Summers
2026-05-04  4:54 ` [PATCH 0/9] Add new debug infrastructure for configfs Summers, Stuart
2026-05-04  5:30 ` ✗ CI.checkpatch: warning for " Patchwork
2026-05-04  5:32 ` ✓ CI.KUnit: success " Patchwork
2026-05-04  6:44 ` ✓ Xe.CI.BAT: " Patchwork
2026-05-04  8:42 ` ✗ Xe.CI.FULL: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zf2fc7ys.fsf@intel.com \
    --to=gustavo.sousa@intel.com \
    --cc=daniele.ceraolospurio@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=matthew.brost@intel.com \
    --cc=matthew.d.roper@intel.com \
    --cc=michal.wajdeczko@intel.com \
    --cc=rodrigo.vivi@intel.com \
    --cc=shuicheng.lin@intel.com \
    --cc=stuart.summers@intel.com \
    --cc=umesh.nerlige.ramappa@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.