From: Daniel Vetter <daniel@ffwll.ch>
To: "Christian König" <ckoenig.leichtzumerken@gmail.com>
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Hamza Mahfooz" <hamza.mahfooz@amd.com>,
linux-kernel@vger.kernel.org, stable@vger.kernel.org,
"Rafael J. Wysocki" <rafael@kernel.org>,
"Alex Deucher" <alexander.deucher@amd.com>,
"Christian König" <christian.koenig@amd.com>,
"Pan, Xinhui" <Xinhui.Pan@amd.com>,
"David Airlie" <airlied@gmail.com>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Mario Limonciello" <mario.limonciello@amd.com>,
"Lijo Lazar" <lijo.lazar@amd.com>,
"Srinivasan Shanmugam" <srinivasan.shanmugam@amd.com>,
"Le Ma" <le.ma@amd.com>, "André Almeida" <andrealmeid@igalia.com>,
"James Zhu" <James.Zhu@amd.com>,
"Aurabindo Pillai" <aurabindo.pillai@amd.com>,
"Alex Shi" <alexs@kernel.org>,
"Jerry Snitselaar" <jsnitsel@redhat.com>,
"Wei Liu" <wei.liu@kernel.org>,
"Robin Murphy" <robin.murphy@arm.com>,
amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
linux-pci@vger.kernel.org
Subject: Re: [PATCH 3/3] drm/amdgpu: wire up the can_remove() callback
Date: Fri, 9 Feb 2024 12:00:11 +0100
Message-ID: <ZcYFu65EOaiZsSnC@phenom.ffwll.local>
In-Reply-To: <051a3088-048e-4613-9f22-8ea17f1b9736@gmail.com>
On Tue, Feb 06, 2024 at 07:42:49PM +0100, Christian König wrote:
> Am 06.02.24 um 15:29 schrieb Daniel Vetter:
> > On Fri, Feb 02, 2024 at 03:40:03PM -0800, Greg Kroah-Hartman wrote:
> > > On Fri, Feb 02, 2024 at 05:25:56PM -0500, Hamza Mahfooz wrote:
> > > > Removing an amdgpu device that still has user space references allocated
> > > > to it causes undefined behaviour.
> > > Then fix that please. There should not be anything special about your
> > > hardware that all of the tens of thousands of other devices can't handle
> > > today.
> > >
> > > What happens when I yank your device out of a system with a pci hotplug
> > > bus? You can't prevent that either, so this should not be any different
> > > at all.
> > >
> > > sorry, but please, just fix your driver.
> > fwiw Christian König from amd already rejected this too, I have no idea
> > why this was submitted
>
> Well that was my fault.
>
> I commented on an internal bug tracker that if sysfs bind/unbind were a
> different code path from PCI remove/re-scan, we could try to reject it.
>
> Turned out it isn't a different code path.
Yeah it's exactly the same code, and removing the sysfs stuff means we
can't test hotunplug anymore without physically hot-unplugging stuff. So
really not great - if one is buggy so is the other, and sysfs allows us to
control the timing a lot better to hit specific issues.
-Sima
> > since the very elaborate plan I developed with a
> > bunch of amd folks was to fix the various lifetime lolz we still have in
> > drm. We unfortunately export the world of internal objects to userspace as
> > uabi objects with dma_buf, dma_fence and everything else, but it's all
> > fixable and we have the plan even documented:
> >
> > https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#device-hot-unplug
> >
> > So yeah anything that isn't that plan of record is very much no-go for drm
> > drivers. Unless we change that plan of course, but that needs a
> > documentation patch first and a big discussion.
> >
> > Aside from an absolutely massive pile of kernel-internal refcounting bugs
> > the really big one we agreed on after a lot of discussion is that SIGBUS
> > on dma-buf mmaps is no-go for drm drivers, because it would break way too
> > much userspace in ways which are simply not fixable (since sig handlers
> > are shared in a process, which means the gl/vk driver cannot use it).
> >
> > Otherwise it's bog standard "fix the kernel bugs" work, just a lot of it.
>
> Ignoring a few memory leaks because of messed up refcounting we actually got
> that working quite nicely.
>
> At least hot unplug / hot add seems to be working rather reliably in our
> internal testing.
>
> So it can't be that messed up.
>
> Regards,
> Christian.
>
> >
> > Cheers, Sima
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch