From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Matthew Auld <matthew.auld@intel.com>,
<aravind.iddamsetty@linux.intel.com>,
<michal.winiarski@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>,
<intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH 2/8] drm/xe: covert sysfs over to devm
Date: Mon, 29 Apr 2024 14:45:26 -0400
Message-ID: <Zi_qxhMrrlUc7hBz@intel.com>
In-Reply-To: <2b6f8692-79ad-4976-99ae-c2b227b893d9@intel.com>

On Mon, Apr 29, 2024 at 04:17:54PM +0100, Matthew Auld wrote:
> On 29/04/2024 14:52, Lucas De Marchi wrote:
> > On Mon, Apr 29, 2024 at 09:28:00AM GMT, Rodrigo Vivi wrote:
> > > On Mon, Apr 29, 2024 at 01:14:38PM +0100, Matthew Auld wrote:
> > > > Hotunplugging the device seems to result in stuff like:
> > > >
> > > > kobject_add_internal failed for tile0 with -EEXIST, don't try to
> > > > register things with the same name in the same directory.
> > > >
> > > > We only remove the sysfs as part of drmm, however that is tied to the
> > > > lifetime of the driver instance and not the device underneath. Attempt
> > > > to fix by using devm for all of the remaining sysfs stuff related to the
> > > > device.
> > >
> > > hmmm... so basically we should use the drmm only for the global module
> > > stuff and the devm for things that are per device?
> >
> > that doesn't make much sense. drmm is supposed to run when the driver
> > unbinds from the device... basically when all refcounts are gone with
> > drm_dev_put(). Are we keeping a ref we shouldn't?
>
> It's run when all refcounts are dropped for that particular drm_device, but
> that is separate from the physical device underneath (struct device). For
> example if something has an open driver fd the drmm release action is not
> going to be called until after that is also closed. But in the meantime we
> might have already removed the pci device and re-attached it to a newly
> allocated drm_device/xe_driver instance, like with hotunplug.
>
> For example, currently we don't even call basic stuff like guc_fini() etc.
> when removing the pci device, but rather when the drm_device is released,
> which sounds quite broken.
>
> So roughly drmm is for drm_device software level stuff and devm is for stuff
> that needs to happen when removing the device. See also the doc for drmm:
> https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_managed.c#L23
>
> Also: https://docs.kernel.org/gpu/drm-uapi.html#device-hot-unplug

Cc'ing Aravind and Michal, since this likely relates to the FLR discussion.

It looks to me like we should move more towards devm_ and limit the use of
drmm_ to a few very specific cases.

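To make the lifetime difference Matthew describes concrete, here is a rough
userspace model of the hotunplug case. It is only a sketch and not kernel
code: every name in it (struct phys_device, struct drm_instance,
hotunplug_with_open_fd() and the helpers) is hypothetical, and the two action
lists merely stand in for devm- and drmm-managed release actions. It assumes
one open fd keeps the old drm_device alive across the unplug.

```c
/* Userspace model only. All types and helpers below are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MAX_ACTIONS 8

typedef void (*release_fn)(void *data);

/* Stand-in for a list of managed release actions. */
struct action_list {
	release_fn fn[MAX_ACTIONS];
	void *data[MAX_ACTIONS];
	int count;
};

static void add_action(struct action_list *l, release_fn fn, void *data)
{
	assert(l->count < MAX_ACTIONS);
	l->fn[l->count] = fn;
	l->data[l->count] = data;
	l->count++;
}

/* Run the actions in reverse order of registration. */
static void run_actions(struct action_list *l)
{
	while (l->count > 0) {
		l->count--;
		l->fn[l->count](l->data[l->count]);
	}
}

/* Models the physical device: owns the sysfs directory and "devm" actions. */
struct phys_device {
	bool tile0_exists;
	struct action_list devm;
};

/* Models one drm_device: refcounted, owns the "drmm" actions. */
struct drm_instance {
	int refcount;
	struct action_list drmm;
};

static int sysfs_add_tile0(struct phys_device *pdev)
{
	if (pdev->tile0_exists)
		return -EEXIST; /* same name, same directory */
	pdev->tile0_exists = true;
	return 0;
}

static void sysfs_remove_tile0(void *data)
{
	struct phys_device *pdev = data;

	pdev->tile0_exists = false;
}

/* "drmm" actions only run once the last reference is dropped. */
static void drm_put(struct drm_instance *drm)
{
	if (--drm->refcount == 0)
		run_actions(&drm->drmm);
}

/* Returns the result of the sysfs add done by the second probe. */
static int hotunplug_with_open_fd(bool use_devm)
{
	struct phys_device pdev = {0};
	struct drm_instance old = { .refcount = 2 }; /* driver + open fd */
	int ret;

	ret = sysfs_add_tile0(&pdev);
	assert(ret == 0);
	add_action(use_devm ? &pdev.devm : &old.drmm,
		   sysfs_remove_tile0, &pdev);

	/* Device removal: device-scoped actions run, driver drops its ref. */
	run_actions(&pdev.devm);
	drm_put(&old); /* the open fd still holds a reference */

	/* Re-probe on the same physical device with a new instance. */
	ret = sysfs_add_tile0(&pdev);

	drm_put(&old); /* userspace finally closes the fd */
	return ret;
}
```

In this model hotunplug_with_open_fd(false) returns -EEXIST, because the
removal is still parked on the old instance when the re-probe runs, while
hotunplug_with_open_fd(true) returns 0.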
>
> >
> > Lucas De Marchi
Thread overview: 28+ messages
2024-04-29 12:14 [PATCH 1/8] drm/xe/device_sysfs: switch over to devm Matthew Auld
2024-04-29 12:14 ` [PATCH 2/8] drm/xe: covert sysfs " Matthew Auld
2024-04-29 13:28 ` Rodrigo Vivi
2024-04-29 13:52 ` Lucas De Marchi
2024-04-29 15:17 ` Matthew Auld
2024-04-29 18:45 ` Rodrigo Vivi [this message]
2024-04-29 21:28 ` Lucas De Marchi
2024-04-30 8:43 ` Jani Nikula
2024-04-30 9:42 ` Aravind Iddamsetty
2024-04-30 10:51 ` Matthew Auld
2024-04-30 13:29 ` drmm vs devm (was Re: [PATCH 2/8] drm/xe: covert sysfs over to devm) Daniel Vetter
2024-05-06 8:07 ` [PATCH 2/8] drm/xe: covert sysfs over to devm Andrzej Hajda
2024-04-29 12:14 ` [PATCH 3/8] drm/xe/ggtt: use drm_dev_enter to mark device section Matthew Auld
2024-05-06 8:42 ` Andrzej Hajda
2024-04-29 12:14 ` [PATCH 4/8] drm/xe/guc: move guc_fini over to devm Matthew Auld
2024-05-06 9:03 ` Andrzej Hajda
2024-04-29 12:14 ` [PATCH 5/8] drm/xe/guc_pc: move pc_fini " Matthew Auld
2024-05-06 9:11 ` Andrzej Hajda
2024-04-29 12:14 ` [PATCH 6/8] drm/xe/irq: move irq_uninstall over " Matthew Auld
2024-05-06 9:12 ` Andrzej Hajda
2024-04-29 12:14 ` [PATCH 7/8] drm/xe/device: move flr " Matthew Auld
2024-05-06 9:12 ` Andrzej Hajda
2024-04-29 12:14 ` [PATCH 8/8] drm/xe/device: move xe_device_sanitize over " Matthew Auld
2024-05-06 17:25 ` Andrzej Hajda
2024-04-29 12:21 ` ✓ CI.Patch_applied: success for series starting with [1/8] drm/xe/device_sysfs: switch " Patchwork
2024-04-29 12:21 ` ✗ CI.checkpatch: warning " Patchwork
2024-04-29 12:22 ` ✓ CI.KUnit: success " Patchwork
2024-05-06 8:04 ` [PATCH 1/8] " Andrzej Hajda