* [PATCH 1/2] drm/gem: Fix race in drm_gem_handle_create_tail()
@ 2025-07-07 15:18 Simona Vetter
2025-07-09 13:54 ` Simona Vetter
0 siblings, 1 reply; 2+ messages in thread
From: Simona Vetter @ 2025-07-07 15:18 UTC (permalink / raw)
To: DRI Development
Cc: Intel Xe Development, Simona Vetter, Jacek Lawrynowicz,
Thomas Zimmermann, stable, Maarten Lankhorst, Maxime Ripard,
David Airlie, Simona Vetter, Simona Vetter
Object creation is a careful dance where we must guarantee that the
object is fully constructed before it is visible to other threads, and
GEM buffer objects are no difference.
Final publishing happens by calling drm_gem_handle_create(). After
that the only allowed thing to do is call drm_gem_object_put() because
a concurrent call to the GEM_CLOSE ioctl with a correctly guessed id
(which is trivial since we have a linear allocator) can already tear
down the object again.
Luckily most drivers get this right, the very few exceptions I've
pinged the relevant maintainers for. Unfortunately we also need
drm_gem_handle_create() when creating additional handles for an
already existing object (e.g. GETFB ioctl or the various bo import
ioctl), and hence we cannot have a drm_gem_handle_create_and_put() as
the only exported function to stop these issues from happening.
Now unfortunately the implementation of drm_gem_handle_create() isn't
living up to standards: It does correctly finishe object
initialization at the global level, and hence is safe against a
concurrent tear down. But it also sets up the file-private aspects of
the handle, and that part goes wrong: We fully register the object in
the drm_file.object_idr before calling drm_vma_node_allow() or
obj->funcs->open, which opens up races against concurrent removal of
that handle in drm_gem_handle_delete().
Fix this with the usual two-stage approach of first reserving the
handle id, and then only registering the object after we've completed
the file-private setup.
Jacek reported this with a testcase of concurrently calling GEM_CLOSE
on a freshly-created object (which also destroys the object), but it
should be possible to hit this with just additional handles created
through import or GETFB without completed destroying the underlying
object with the concurrent GEM_CLOSE ioctl calls.
Note that the close-side of this race was fixed in f6cd7daecff5 ("drm:
Release driver references to handle before making it available
again"), which means a cool 9 years have passed until someone noticed
that we need to make this symmetry or there's still gaps left :-/
Without the 2-stage close approach we'd still have a race, therefore
that's an integral part of this bugfix.
More importantly, this means we can have NULL pointers behind
allocated id in our drm_file.object_idr. We need to check for that
now:
- drm_gem_handle_delete() checks for ERR_OR_NULL already
- drm_gem.c:object_lookup() also chekcs for NULL
- drm_gem_release() should never be called if there's another thread
still existing that could call into an IOCTL that creates a new
handle, so cannot race. For paranoia I added a NULL check to
drm_gem_object_release_handle() though.
- most drivers (etnaviv, i915, msm) are find because they use
idr_find(), which maps both ENOENT and NULL to NULL.
- drivers using idr_for_each_entry() should also be fine, because
idr_get_next does filter out NULL entries and continues the
iteration.
- The same holds for drm_show_memory_stats().
v2: Use drm_WARN_ON (Thomas)
Reported-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Tested-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: stable@vger.kernel.org
Cc: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Signed-off-by: Simona Vetter <simona.vetter@intel.com>
Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
---
drivers/gpu/drm/drm_gem.c | 10 +++++++++-
include/drm/drm_file.h | 3 +++
2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index bc505d938b3e..1aa9192c4cc6 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -316,6 +316,9 @@ drm_gem_object_release_handle(int id, void *ptr, void *data)
struct drm_file *file_priv = data;
struct drm_gem_object *obj = ptr;
+ if (drm_WARN_ON(obj->dev, !data))
+ return 0;
+
if (obj->funcs->close)
obj->funcs->close(obj, file_priv);
@@ -436,7 +439,7 @@ drm_gem_handle_create_tail(struct drm_file *file_priv,
idr_preload(GFP_KERNEL);
spin_lock(&file_priv->table_lock);
- ret = idr_alloc(&file_priv->object_idr, obj, 1, 0, GFP_NOWAIT);
+ ret = idr_alloc(&file_priv->object_idr, NULL, 1, 0, GFP_NOWAIT);
spin_unlock(&file_priv->table_lock);
idr_preload_end();
@@ -457,6 +460,11 @@ drm_gem_handle_create_tail(struct drm_file *file_priv,
goto err_revoke;
}
+ /* mirrors drm_gem_handle_delete to avoid races */
+ spin_lock(&file_priv->table_lock);
+ obj = idr_replace(&file_priv->object_idr, obj, handle);
+ WARN_ON(obj != NULL);
+ spin_unlock(&file_priv->table_lock);
*handlep = handle;
return 0;
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index eab7546aad79..115763799625 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -300,6 +300,9 @@ struct drm_file {
*
* Mapping of mm object handles to object pointers. Used by the GEM
* subsystem. Protected by @table_lock.
+ *
+ * Note that allocated entries might be NULL as a transient state when
+ * creating or deleting a handle.
*/
struct idr object_idr;
--
2.49.0
^ permalink raw reply related [flat|nested] 2+ messages in thread* Re: [PATCH 1/2] drm/gem: Fix race in drm_gem_handle_create_tail()
2025-07-07 15:18 [PATCH 1/2] drm/gem: Fix race in drm_gem_handle_create_tail() Simona Vetter
@ 2025-07-09 13:54 ` Simona Vetter
0 siblings, 0 replies; 2+ messages in thread
From: Simona Vetter @ 2025-07-09 13:54 UTC (permalink / raw)
To: DRI Development
Cc: Intel Xe Development, Simona Vetter, Jacek Lawrynowicz,
Thomas Zimmermann, stable, Maarten Lankhorst, Maxime Ripard,
David Airlie, Simona Vetter, Simona Vetter
On Mon, Jul 07, 2025 at 05:18:13PM +0200, Simona Vetter wrote:
> Object creation is a careful dance where we must guarantee that the
> object is fully constructed before it is visible to other threads, and
> GEM buffer objects are no difference.
>
> Final publishing happens by calling drm_gem_handle_create(). After
> that the only allowed thing to do is call drm_gem_object_put() because
> a concurrent call to the GEM_CLOSE ioctl with a correctly guessed id
> (which is trivial since we have a linear allocator) can already tear
> down the object again.
>
> Luckily most drivers get this right, the very few exceptions I've
> pinged the relevant maintainers for. Unfortunately we also need
> drm_gem_handle_create() when creating additional handles for an
> already existing object (e.g. GETFB ioctl or the various bo import
> ioctl), and hence we cannot have a drm_gem_handle_create_and_put() as
> the only exported function to stop these issues from happening.
>
> Now unfortunately the implementation of drm_gem_handle_create() isn't
> living up to standards: It does correctly finishe object
> initialization at the global level, and hence is safe against a
> concurrent tear down. But it also sets up the file-private aspects of
> the handle, and that part goes wrong: We fully register the object in
> the drm_file.object_idr before calling drm_vma_node_allow() or
> obj->funcs->open, which opens up races against concurrent removal of
> that handle in drm_gem_handle_delete().
>
> Fix this with the usual two-stage approach of first reserving the
> handle id, and then only registering the object after we've completed
> the file-private setup.
>
> Jacek reported this with a testcase of concurrently calling GEM_CLOSE
> on a freshly-created object (which also destroys the object), but it
> should be possible to hit this with just additional handles created
> through import or GETFB without completed destroying the underlying
> object with the concurrent GEM_CLOSE ioctl calls.
>
> Note that the close-side of this race was fixed in f6cd7daecff5 ("drm:
> Release driver references to handle before making it available
> again"), which means a cool 9 years have passed until someone noticed
> that we need to make this symmetry or there's still gaps left :-/
> Without the 2-stage close approach we'd still have a race, therefore
> that's an integral part of this bugfix.
>
> More importantly, this means we can have NULL pointers behind
> allocated id in our drm_file.object_idr. We need to check for that
> now:
>
> - drm_gem_handle_delete() checks for ERR_OR_NULL already
>
> - drm_gem.c:object_lookup() also chekcs for NULL
>
> - drm_gem_release() should never be called if there's another thread
> still existing that could call into an IOCTL that creates a new
> handle, so cannot race. For paranoia I added a NULL check to
> drm_gem_object_release_handle() though.
>
> - most drivers (etnaviv, i915, msm) are find because they use
> idr_find(), which maps both ENOENT and NULL to NULL.
>
> - drivers using idr_for_each_entry() should also be fine, because
> idr_get_next does filter out NULL entries and continues the
> iteration.
>
> - The same holds for drm_show_memory_stats().
>
> v2: Use drm_WARN_ON (Thomas)
>
> Reported-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
> Tested-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: stable@vger.kernel.org
> Cc: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: David Airlie <airlied@gmail.com>
> Cc: Simona Vetter <simona@ffwll.ch>
> Signed-off-by: Simona Vetter <simona.vetter@intel.com>
> Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
Pushed to drm-misc-fixes, thanks for the reviews.
-Sima
> ---
> drivers/gpu/drm/drm_gem.c | 10 +++++++++-
> include/drm/drm_file.h | 3 +++
> 2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index bc505d938b3e..1aa9192c4cc6 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -316,6 +316,9 @@ drm_gem_object_release_handle(int id, void *ptr, void *data)
> struct drm_file *file_priv = data;
> struct drm_gem_object *obj = ptr;
>
> + if (drm_WARN_ON(obj->dev, !data))
> + return 0;
> +
> if (obj->funcs->close)
> obj->funcs->close(obj, file_priv);
>
> @@ -436,7 +439,7 @@ drm_gem_handle_create_tail(struct drm_file *file_priv,
> idr_preload(GFP_KERNEL);
> spin_lock(&file_priv->table_lock);
>
> - ret = idr_alloc(&file_priv->object_idr, obj, 1, 0, GFP_NOWAIT);
> + ret = idr_alloc(&file_priv->object_idr, NULL, 1, 0, GFP_NOWAIT);
>
> spin_unlock(&file_priv->table_lock);
> idr_preload_end();
> @@ -457,6 +460,11 @@ drm_gem_handle_create_tail(struct drm_file *file_priv,
> goto err_revoke;
> }
>
> + /* mirrors drm_gem_handle_delete to avoid races */
> + spin_lock(&file_priv->table_lock);
> + obj = idr_replace(&file_priv->object_idr, obj, handle);
> + WARN_ON(obj != NULL);
> + spin_unlock(&file_priv->table_lock);
> *handlep = handle;
> return 0;
>
> diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
> index eab7546aad79..115763799625 100644
> --- a/include/drm/drm_file.h
> +++ b/include/drm/drm_file.h
> @@ -300,6 +300,9 @@ struct drm_file {
> *
> * Mapping of mm object handles to object pointers. Used by the GEM
> * subsystem. Protected by @table_lock.
> + *
> + * Note that allocated entries might be NULL as a transient state when
> + * creating or deleting a handle.
> */
> struct idr object_idr;
>
> --
> 2.49.0
>
--
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2025-07-09 13:54 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-07 15:18 [PATCH 1/2] drm/gem: Fix race in drm_gem_handle_create_tail() Simona Vetter
2025-07-09 13:54 ` Simona Vetter
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox