From: Simona Vetter <simona.vetter@ffwll.ch>
To: "Christian König" <christian.koenig@amd.com>
Cc: Simona Vetter <simona.vetter@ffwll.ch>,
Thomas Zimmermann <tzimmermann@suse.de>,
simona@ffwll.ch, airlied@gmail.com,
torvalds@linux-foundation.org, maarten.lankhorst@linux.intel.com,
mripard@kernel.org, l.stach@pengutronix.de,
linux+etnaviv@armlinux.org.uk, kraxel@redhat.com,
christian.gmeiner@gmail.com, dmitry.osipenko@collabora.com,
gurchetansingh@chromium.org, olvaffe@gmail.com,
zack.rusin@broadcom.com, bcm-kernel-feedback-list@broadcom.com,
dri-devel@lists.freedesktop.org, etnaviv@lists.freedesktop.org,
virtualization@lists.linux.dev, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/9] Revert "drm/gem: Acquire references on GEM handles for framebuffers"
Date: Fri, 11 Jul 2025 13:22:15 +0200
Message-ID: <aHDz59OYxwZzDrsG@phenom.ffwll.local>
In-Reply-To: <f46837f2-bb04-4113-8d82-1701a4d45ac3@amd.com>

On Fri, Jul 11, 2025 at 01:00:03PM +0200, Christian König wrote:
> On 11.07.25 12:08, Simona Vetter wrote:
> > On Fri, Jul 11, 2025 at 11:35:17AM +0200, Thomas Zimmermann wrote:
> >> This reverts commit 5307dce878d4126e1b375587318955bd019c3741.
> >>
> >> We're going to revert the dma-buf handle back to separating dma_buf
> >> and import_attach->dmabuf in struct drm_gem_object. Hence revert this
> >> fix for it.
> >
> > I think we should add my reasons from the private thread here why I think
> > this is conceptually wrong:
> >
> > handle_count is an uapi reference, and should have nothing to do with the
> > lifetime and consistency of the underlying gem_bo.
>
> The problem is that we tied the lifetime of the DMA-buf reference to the
> handle count and I think that is not 100% clean.
>
> The reason why that was done is to break the circle dependency GEM obj
> -> DMA-buf -> GEM obj, but what potentially should actually happen is
> that we distinct between a structure reference and an use count.
>
> E.g. similar to what is done with mm_struct and mmgrab()/mmdrop() vs
> mmget()/mmput().
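
To make the mmgrab()/mmdrop() vs mmget()/mmput() analogy concrete, here is
a rough userspace sketch of an object with two counters: one that only
keeps the structure itself valid, and one that tracks active users of its
payload. All toy_* names are made up for illustration. This is not kernel
code and not a proposed GEM design, just the general pattern under those
assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy object, not a kernel structure. */
struct toy_obj {
	int struct_refs;   /* keeps the struct valid (cf. mmgrab/mmdrop) */
	int use_count;     /* keeps the payload usable (cf. mmget/mmput) */
	bool payload_live; /* stands in for the resource behind the object */
	bool freed;        /* stands in for kfree() of the struct */
};

static void toy_init(struct toy_obj *o)
{
	o->struct_refs = 1; /* held collectively by all users */
	o->use_count = 1;
	o->payload_live = true;
	o->freed = false;
}

/* Structure reference: pins the struct, says nothing about the payload. */
static void toy_grab(struct toy_obj *o)
{
	o->struct_refs++;
}

static void toy_drop(struct toy_obj *o)
{
	assert(o->struct_refs > 0);
	if (--o->struct_refs == 0)
		o->freed = true;
}

/* Use count: cannot be revived once it hit zero (cf. mmget_not_zero()). */
static bool toy_get(struct toy_obj *o)
{
	if (o->use_count == 0)
		return false;
	o->use_count++;
	return true;
}

static void toy_put(struct toy_obj *o)
{
	assert(o->use_count > 0);
	if (--o->use_count == 0) {
		o->payload_live = false; /* tear down the payload */
		toy_drop(o);             /* release the users' struct ref */
	}
}
```

The point of the split is that a holder of only a structure reference can
still safely look at the object after the last user is gone, but it cannot
bring the use count back from zero.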

Yeah, I think I'm following. The issue I see here is that I think we're
free-wheeling, don't have enough testcases, and break existing stuff way
too much. So back to square one, start over, probably with a lot of
kerneldoc patches first and more igt and kunit tests to hit all these
issues we've (re-)discovered.

> > And for imported bo the
> > link to the dma-buf really should be invariant, and hence
> > drm_gem_object_get/put() enough. The fact that this patch seems to have
> > helped at least in some cases indicates that our assumption that we can
> > replace gem_bo->import_attach.dmabuf with gem_bo->dma_buf was wrong,
> > because pretty obviously the latter can become NULL while the gem_bo is
> > still alive. Which means this was conceptually wrong and at best helped
> > hide a race condition somewhere.
> >
> > This means that unlike the claim in the reverted commit that 1a148af06000
> > ("drm/gem-shmem: Use dma_buf from GEM object instance") started triggering
> > an existing condition the much more likely explanation is that it
> > introduced the regression itself. And hence we need to revert this entire
> > chain of commits.
>
> The existing condition is still a problem I think. We ran into issues with that multiple times already.
>
> Just imagine the following scenario:
> 1. GEM obj is exported, DMA-buf file descriptor created
> 2. GEM obj is used in a FB.
> 3. GEM obj is closed, handle_count goes from 1->0, DMA-buf reference is dropped, but file descriptor remains open, obj->dma_buf set to NULL
> 4. Userspace calls DRM_IOCTL_MODE_GETFB2, handle count goes 0->1 again, but obj->dma_buf is still NULL!
> 5. GEM obj is exported again, second DMA-buf is created.
>
> The first time I stumbled over that it took me a week to figure out why
> we can have two DMA-bufs for the same GEM obj. Especially you can
> trigger the "WARN_ON(obj->dma_buf != dma_buf);" in
> drm_gem_prime_fd_to_handle() with this.
>
> For my particular use case it was just a broken unit test, but it allows
> userspace to mess up the kernel objects quite a bit and that is really
> not good.
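
The five steps above can be replayed in a small userspace toy model. This
is only a sketch under simplifying assumptions: the toy_* names are
hypothetical, there is no locking, and the export cache is reduced to a
single pointer that gets cleared when handle_count drops to zero. It just
shows how the 1->0->1 transition ends up with two dma-bufs for one object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel structures. */
struct toy_dmabuf { int id; };

struct toy_gem {
	int refcount;               /* object lifetime */
	int handle_count;           /* number of userspace handles */
	struct toy_dmabuf *dma_buf; /* cached export, tied to handle_count */
};

static struct toy_dmabuf toy_pool[4];
static int toy_exports; /* how many dma-bufs were ever created */

static void toy_handle_create(struct toy_gem *obj)
{
	if (obj->handle_count++ == 0)
		obj->refcount++;
}

static void toy_handle_delete(struct toy_gem *obj)
{
	assert(obj->handle_count > 0);
	if (--obj->handle_count == 0) {
		obj->dma_buf = NULL; /* cached dma-buf reference dropped */
		obj->refcount--;
	}
}

/* Reuse the cached dma-buf if there is one, else create a new one. */
static struct toy_dmabuf *toy_export(struct toy_gem *obj)
{
	if (!obj->dma_buf) {
		assert(toy_exports < 4);
		toy_pool[toy_exports].id = toy_exports;
		obj->dma_buf = &toy_pool[toy_exports++];
	}
	return obj->dma_buf;
}

/* Returns nonzero if the object ended up with two distinct dma-bufs. */
static int toy_run_scenario(void)
{
	struct toy_gem obj = { 0 };
	struct toy_dmabuf *first, *second;

	toy_handle_create(&obj);   /* object created with one handle */
	first = toy_export(&obj);  /* 1. exported, fd stays open */
	obj.refcount++;            /* 2. FB holds an object reference */
	toy_handle_delete(&obj);   /* 3. handle_count 1->0, cache cleared */
	toy_handle_create(&obj);   /* 4. GETFB2: handle_count 0->1 */
	second = toy_export(&obj); /* 5. exported again */

	return first != second;
}
```

In this model toy_run_scenario() reports that the second export did not
reuse the first dma-buf, even though the first one is still reachable
through its file descriptor.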

Yeah that's not good, but I think it's something we should sort out by
adding testcases first and figuring out fixes in -next, not in late -rc
kernels.

I think the minimal fix for this corner case would be to add a flag to
gem_bo to record that GETFB/2 has been called on it, and in that case
disable that WARN_ON and just quietly bail out instead. Userspace gets to
keep the pieces.
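
A rough sketch of what such a flag check could look like, as a userspace
toy rather than a patch: the had_getfb field, the toy_* names and the
three result codes are all assumptions made up for illustration, and what
exactly "bail out" would return to userspace is left open here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant gem_bo state. */
struct toy_gem {
	const void *dma_buf; /* dma-buf currently cached on the object */
	bool had_getfb;      /* assumed flag: GETFB/2 was called on this bo */
};

enum toy_import_result {
	TOY_IMPORT_OK,   /* dma-buf matches, or gets cached now */
	TOY_IMPORT_BAIL, /* mismatch after GETFB/2: fail quietly */
	TOY_IMPORT_WARN, /* mismatch without GETFB/2: still a kernel bug */
};

/* Models the consistency check done when importing a dma-buf fd. */
static enum toy_import_result
toy_import_check(struct toy_gem *obj, const void *dma_buf)
{
	assert(obj != NULL);

	if (!obj->dma_buf) {
		obj->dma_buf = dma_buf;
		return TOY_IMPORT_OK;
	}
	if (obj->dma_buf == dma_buf)
		return TOY_IMPORT_OK;

	/* Userspace resurrected a handle via GETFB/2: no warning. */
	if (obj->had_getfb)
		return TOY_IMPORT_BAIL;

	return TOY_IMPORT_WARN;
}
```

The warning stays in place for every object that never went through
GETFB/2, so it would still catch real refcounting bugs.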

I'm not sure whether making the entire lifetime stuff even more
complicated by elevating handle_count to a more general usage count is the
right approach.

Cheers, Sima
>
> Regards,
> Christian.
>
> >
> > I'll also add all the Fixes: lines as needed when merging these to
> > drm-fixes, since some of the patches reverted in this series have landed
> > in 6.15 already.
> >
> > I plan to merge them all to drm-fixes once intel-gfx-ci has approved it
> > all.
> >
> > Thanks, Sima
> >
> >> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> >> ---
> >> drivers/gpu/drm/drm_gem.c | 44 ++------------------
> >> drivers/gpu/drm/drm_gem_framebuffer_helper.c | 16 ++++---
> >> drivers/gpu/drm/drm_internal.h | 2 -
> >> 3 files changed, 11 insertions(+), 51 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> >> index 3a99e4a5d303..db44c40e307f 100644
> >> --- a/drivers/gpu/drm/drm_gem.c
> >> +++ b/drivers/gpu/drm/drm_gem.c
> >> @@ -213,35 +213,6 @@ void drm_gem_private_object_fini(struct drm_gem_object *obj)
> >> }
> >> EXPORT_SYMBOL(drm_gem_private_object_fini);
> >>
> >> -static void drm_gem_object_handle_get(struct drm_gem_object *obj)
> >> -{
> >> - struct drm_device *dev = obj->dev;
> >> -
> >> - drm_WARN_ON(dev, !mutex_is_locked(&dev->object_name_lock));
> >> -
> >> - if (obj->handle_count++ == 0)
> >> - drm_gem_object_get(obj);
> >> -}
> >> -
> >> -/**
> >> - * drm_gem_object_handle_get_unlocked - acquire reference on user-space handles
> >> - * @obj: GEM object
> >> - *
> >> - * Acquires a reference on the GEM buffer object's handle. Required
> >> - * to keep the GEM object alive. Call drm_gem_object_handle_put_unlocked()
> >> - * to release the reference.
> >> - */
> >> -void drm_gem_object_handle_get_unlocked(struct drm_gem_object *obj)
> >> -{
> >> - struct drm_device *dev = obj->dev;
> >> -
> >> - guard(mutex)(&dev->object_name_lock);
> >> -
> >> - drm_WARN_ON(dev, !obj->handle_count); /* first ref taken in create-tail helper */
> >> - drm_gem_object_handle_get(obj);
> >> -}
> >> -EXPORT_SYMBOL(drm_gem_object_handle_get_unlocked);
> >> -
> >> /**
> >> * drm_gem_object_handle_free - release resources bound to userspace handles
> >> * @obj: GEM object to clean up.
> >> @@ -272,14 +243,8 @@ static void drm_gem_object_exported_dma_buf_free(struct drm_gem_object *obj)
> >> }
> >> }
> >>
> >> -/**
> >> - * drm_gem_object_handle_put_unlocked - releases reference on user-space handles
> >> - * @obj: GEM object
> >> - *
> >> - * Releases a reference on the GEM buffer object's handle. Possibly releases
> >> - * the GEM buffer object and associated dma-buf objects.
> >> - */
> >> -void drm_gem_object_handle_put_unlocked(struct drm_gem_object *obj)
> >> +static void
> >> +drm_gem_object_handle_put_unlocked(struct drm_gem_object *obj)
> >> {
> >> struct drm_device *dev = obj->dev;
> >> bool final = false;
> >> @@ -304,7 +269,6 @@ void drm_gem_object_handle_put_unlocked(struct drm_gem_object *obj)
> >> if (final)
> >> drm_gem_object_put(obj);
> >> }
> >> -EXPORT_SYMBOL(drm_gem_object_handle_put_unlocked);
> >>
> >> /*
> >> * Called at device or object close to release the file's
> >> @@ -429,8 +393,8 @@ drm_gem_handle_create_tail(struct drm_file *file_priv,
> >> int ret;
> >>
> >> WARN_ON(!mutex_is_locked(&dev->object_name_lock));
> >> -
> >> - drm_gem_object_handle_get(obj);
> >> + if (obj->handle_count++ == 0)
> >> + drm_gem_object_get(obj);
> >>
> >> /*
> >> * Get the user-visible handle using idr. Preload and perform
> >> diff --git a/drivers/gpu/drm/drm_gem_framebuffer_helper.c b/drivers/gpu/drm/drm_gem_framebuffer_helper.c
> >> index c60d0044d036..618ce725cd75 100644
> >> --- a/drivers/gpu/drm/drm_gem_framebuffer_helper.c
> >> +++ b/drivers/gpu/drm/drm_gem_framebuffer_helper.c
> >> @@ -100,7 +100,7 @@ void drm_gem_fb_destroy(struct drm_framebuffer *fb)
> >> unsigned int i;
> >>
> >> for (i = 0; i < fb->format->num_planes; i++)
> >> - drm_gem_object_handle_put_unlocked(fb->obj[i]);
> >> + drm_gem_object_put(fb->obj[i]);
> >>
> >> drm_framebuffer_cleanup(fb);
> >> kfree(fb);
> >> @@ -183,10 +183,8 @@ int drm_gem_fb_init_with_funcs(struct drm_device *dev,
> >> if (!objs[i]) {
> >> drm_dbg_kms(dev, "Failed to lookup GEM object\n");
> >> ret = -ENOENT;
> >> - goto err_gem_object_handle_put_unlocked;
> >> + goto err_gem_object_put;
> >> }
> >> - drm_gem_object_handle_get_unlocked(objs[i]);
> >> - drm_gem_object_put(objs[i]);
> >>
> >> min_size = (height - 1) * mode_cmd->pitches[i]
> >> + drm_format_info_min_pitch(info, i, width)
> >> @@ -196,22 +194,22 @@ int drm_gem_fb_init_with_funcs(struct drm_device *dev,
> >> drm_dbg_kms(dev,
> >> "GEM object size (%zu) smaller than minimum size (%u) for plane %d\n",
> >> objs[i]->size, min_size, i);
> >> - drm_gem_object_handle_put_unlocked(objs[i]);
> >> + drm_gem_object_put(objs[i]);
> >> ret = -EINVAL;
> >> - goto err_gem_object_handle_put_unlocked;
> >> + goto err_gem_object_put;
> >> }
> >> }
> >>
> >> ret = drm_gem_fb_init(dev, fb, mode_cmd, objs, i, funcs);
> >> if (ret)
> >> - goto err_gem_object_handle_put_unlocked;
> >> + goto err_gem_object_put;
> >>
> >> return 0;
> >>
> >> -err_gem_object_handle_put_unlocked:
> >> +err_gem_object_put:
> >> while (i > 0) {
> >> --i;
> >> - drm_gem_object_handle_put_unlocked(objs[i]);
> >> + drm_gem_object_put(objs[i]);
> >> }
> >> return ret;
> >> }
> >> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
> >> index f921cc73f8b8..9078504e789c 100644
> >> --- a/drivers/gpu/drm/drm_internal.h
> >> +++ b/drivers/gpu/drm/drm_internal.h
> >> @@ -161,8 +161,6 @@ void drm_sysfs_lease_event(struct drm_device *dev);
> >>
> >> /* drm_gem.c */
> >> int drm_gem_init(struct drm_device *dev);
> >> -void drm_gem_object_handle_get_unlocked(struct drm_gem_object *obj);
> >> -void drm_gem_object_handle_put_unlocked(struct drm_gem_object *obj);
> >> int drm_gem_handle_create_tail(struct drm_file *file_priv,
> >> struct drm_gem_object *obj,
> >> u32 *handlep);
> >> --
> >> 2.50.0
> >>
> >
>
--
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Thread overview: 28+ messages
2025-07-11 9:35 [PATCH 0/9] drm: Revert general use of struct drm_gem_object.dma_buf Thomas Zimmermann
2025-07-11 9:35 ` [PATCH 1/9] Revert "drm/framebuffer: Acquire internal references on GEM handles" Thomas Zimmermann
2025-07-11 9:35 ` [PATCH 2/9] Revert "drm/gem: Acquire references on GEM handles for framebuffers" Thomas Zimmermann
2025-07-11 10:08 ` Simona Vetter
2025-07-11 11:00 ` Christian König
2025-07-11 11:22 ` Simona Vetter [this message]
2025-07-11 9:35 ` [PATCH 3/9] Revert "drm/virtio: Use dma_buf from GEM object instance" Thomas Zimmermann
2025-07-11 11:29 ` Dmitry Osipenko
2025-07-11 11:31 ` Simona Vetter
2025-07-11 11:49 ` Dmitry Osipenko
2025-07-11 12:01 ` Thomas Zimmermann
2025-07-11 12:15 ` Dmitry Osipenko
2025-07-11 9:35 ` [PATCH 4/9] Revert "drm/vmwgfx: " Thomas Zimmermann
2025-07-11 9:35 ` [PATCH 5/9] Revert "drm/etnaviv: " Thomas Zimmermann
2025-07-11 9:35 ` [PATCH 6/9] Revert "drm/prime: " Thomas Zimmermann
2025-07-11 9:35 ` [PATCH 7/9] Revert "drm/gem-framebuffer: " Thomas Zimmermann
2025-07-11 9:35 ` [PATCH 8/9] Revert "drm/gem-shmem: " Thomas Zimmermann
2025-07-11 9:35 ` [PATCH 9/9] Revert "drm/gem-dma: " Thomas Zimmermann
2025-07-11 10:32 ` [PATCH 0/9] drm: Revert general use of struct drm_gem_object.dma_buf Christian König
2025-07-11 11:26 ` Simona Vetter
2025-07-11 15:48 ` Linus Torvalds
2025-07-11 16:41 ` Thomas Zimmermann
2025-07-11 17:35 ` Linus Torvalds
2025-07-11 18:37 ` Linus Torvalds
2025-07-11 21:52 ` Simona Vetter
2025-07-14 12:39 ` Simona Vetter
2025-07-15 7:41 ` Thomas Zimmermann
2025-07-15 13:07 ` Simona Vetter