From: Simona Vetter <simona.vetter@ffwll.ch>
To: DRI Development <dri-devel@lists.freedesktop.org>
Cc: intel-xe@lists.freedesktop.org,
Simona Vetter <simona.vetter@ffwll.ch>,
Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>,
stable@vger.kernel.org,
Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
Maxime Ripard <mripard@kernel.org>,
Thomas Zimmermann <tzimmermann@suse.de>,
David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
Simona Vetter <simona.vetter@intel.com>
Subject: Re: [PATCH 1/8] drm/gem: Fix race in drm_gem_handle_create_tail()
Date: Wed, 28 May 2025 11:26:20 +0200 [thread overview]
Message-ID: <aDbWvNUaXCCvvQkc@phenom.ffwll.local> (raw)
In-Reply-To: <20250528091307.1894940-2-simona.vetter@ffwll.ch>
On Wed, May 28, 2025 at 11:12:59AM +0200, Simona Vetter wrote:
> Object creation is a careful dance where we must guarantee that the
> object is fully constructed before it is visible to other threads, and
> GEM buffer objects are no difference.
>
> Final publishing happens by calling drm_gem_handle_create(). After
> that the only allowed thing to do is call drm_gem_object_put() because
> a concurrent call to the GEM_CLOSE ioctl with a correctly guessed id
> (which is trivial since we have a linear allocator) can already tear
> down the object again.
>
> Luckily most drivers get this right, the very few exceptions I've
> pinged the relevant maintainers for. Unfortunately we also need
> drm_gem_handle_create() when creating additional handles for an
> already existing object (e.g. GETFB ioctl or the various bo import
> ioctl), and hence we cannot have a drm_gem_handle_create_and_put() as
> the only exported function to stop these issues from happening.
>
> Now unfortunately the implementation of drm_gem_handle_create() isn't
> living up to standards: It does correctly finishe object
> initialization at the global level, and hence is safe against a
> concurrent tear down. But it also sets up the file-private aspects of
> the handle, and that part goes wrong: We fully register the object in
> the drm_file.object_idr before calling drm_vma_node_allow() or
> obj->funcs->open, which opens up races against concurrent removal of
> that handle in drm_gem_handle_delete().
>
> Fix this with the usual two-stage approach of first reserving the
> handle id, and then only registering the object after we've completed
> the file-private setup.
>
> Jacek reported this with a testcase of concurrently calling GEM_CLOSE
> on a freshly-created object (which also destroys the object), but it
> should be possible to hit this with just additional handles created
> through import or GETFB without completed destroying the underlying
> object with the concurrent GEM_CLOSE ioctl calls.
>
> Note that the close-side of this race was fixed in f6cd7daecff5 ("drm:
> Release driver references to handle before making it available
> again"), which means a cool 9 years have passed until someone noticed
> that we need to make this symmetry or there's still gaps left :-/
> Without the 2-stage close approach we'd still have a race, therefore
> that's an integral part of this bugfix.
>
> More importantly, this means we can have NULL pointers behind
> allocated id in our drm_file.object_idr. We need to check for that
> now:
>
> - drm_gem_handle_delete() checks for ERR_OR_NULL already
>
> - drm_gem.c:object_lookup() also chekcs for NULL
>
> - drm_gem_release() should never be called if there's another thread
> still existing that could call into an IOCTL that creates a new
> handle, so cannot race. For paranoia I added a NULL check to
> drm_gem_object_release_handle() though.
>
> - most drivers (etnaviv, i915, msm) are find because they use
> idr_find, which maps both ENOENT and NULL to NULL.
>
> - vmgfx is already broken vmw_debugfs_gem_info_show() because NULL
> pointers might exist due to drm_gem_handle_delete(). This needs a
> separate patch. This is because idr_for_each_entry terminates on the
> first NULL entry and so might not iterate over everything.
>
> - similar for amd in amdgpu_debugfs_gem_info_show() and
> amdgpu_gem_force_release(). The latter is really questionable though
> since it's a best effort hack and there's no way to close all the
> races. Needs separate patches.
>
> - xe is really broken because it not uses idr_for_each_entry() but
> also drops the drm_file.table_lock, which can wreak the idr iterator
> state if you're unlucky enough. Maybe another reason to look into
> the drm fdinfo memory stats instead of hand-rolling too much.
>
> - drm_show_memory_stats() is also broken since it uses
> idr_for_each_entry. But since that's a preexisting bug I'll follow
> up with a separate patch.
I've already reworded the commit message locally since I now think
idr_for_each_entry is entirely fine.
-Sima
>
> Reported-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
> Cc: stable@vger.kernel.org
> Cc: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: David Airlie <airlied@gmail.com>
> Cc: Simona Vetter <simona@ffwll.ch>
> Signed-off-by: Simona Vetter <simona.vetter@intel.com>
> Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
> ---
> drivers/gpu/drm/drm_gem.c | 10 +++++++++-
> include/drm/drm_file.h | 3 +++
> 2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 1e659d2660f7..e4e20dda47b1 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -279,6 +279,9 @@ drm_gem_object_release_handle(int id, void *ptr, void *data)
> struct drm_file *file_priv = data;
> struct drm_gem_object *obj = ptr;
>
> + if (WARN_ON(!data))
> + return 0;
> +
> if (obj->funcs->close)
> obj->funcs->close(obj, file_priv);
>
> @@ -399,7 +402,7 @@ drm_gem_handle_create_tail(struct drm_file *file_priv,
> idr_preload(GFP_KERNEL);
> spin_lock(&file_priv->table_lock);
>
> - ret = idr_alloc(&file_priv->object_idr, obj, 1, 0, GFP_NOWAIT);
> + ret = idr_alloc(&file_priv->object_idr, NULL, 1, 0, GFP_NOWAIT);
>
> spin_unlock(&file_priv->table_lock);
> idr_preload_end();
> @@ -420,6 +423,11 @@ drm_gem_handle_create_tail(struct drm_file *file_priv,
> goto err_revoke;
> }
>
> + /* mirrors drm_gem_handle_delete to avoid races */
> + spin_lock(&file_priv->table_lock);
> + obj = idr_replace(&file_priv->object_idr, obj, handle);
> + WARN_ON(obj != NULL);
> + spin_unlock(&file_priv->table_lock);
> *handlep = handle;
> return 0;
>
> diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
> index 5c3b2aa3e69d..d344d41e6cfe 100644
> --- a/include/drm/drm_file.h
> +++ b/include/drm/drm_file.h
> @@ -300,6 +300,9 @@ struct drm_file {
> *
> * Mapping of mm object handles to object pointers. Used by the GEM
> * subsystem. Protected by @table_lock.
> + *
> + * Note that allocated entries might be NULL as a transient state when
> + * creating or deleting a handle.
> */
> struct idr object_idr;
>
> --
> 2.49.0
>
--
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
next prev parent reply other threads:[~2025-05-28 9:26 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-28 9:12 [PATCH 0/8] drm/gem: Audit around handle_create races Simona Vetter
2025-05-28 9:12 ` [PATCH 1/8] drm/gem: Fix race in drm_gem_handle_create_tail() Simona Vetter
2025-05-28 9:26 ` Simona Vetter [this message]
2025-05-28 13:20 ` Jacek Lawrynowicz
2025-06-02 15:15 ` Thomas Zimmermann
2025-06-03 11:45 ` Simona Vetter
2025-06-03 12:40 ` Thomas Zimmermann
2025-06-04 9:02 ` Simona Vetter
2025-05-28 9:13 ` [PATCH 2/8] drm/fdinfo: Switch to idr_for_each() in drm_show_memory_stats() Simona Vetter
2025-05-28 9:22 ` Simona Vetter
2025-05-28 20:10 ` kernel test robot
2025-05-28 9:13 ` [PATCH 3/8] drm/panthor: Fix UAF in panthor_gem_create_with_handle() debugfs code Simona Vetter
2025-05-29 12:31 ` kernel test robot
2025-06-01 14:06 ` Adrián Larumbe
2025-06-02 14:46 ` Simona Vetter
2025-05-28 9:13 ` [PATCH 4/8] accel/qaic: delete qaic_bo.handle Simona Vetter
2025-05-28 15:15 ` Jeff Hugo
2025-06-02 14:43 ` Simona Vetter
2025-06-03 14:43 ` Jeff Hugo
2025-06-06 16:25 ` Jeff Hugo
2025-05-28 9:13 ` [PATCH 5/8] drm/amd/kfd: Add comment about possible drm_gem_handle_create() race Simona Vetter
2025-05-28 9:13 ` [PATCH 6/8] drm/amdgpu: Add comments about drm_file.object_idr issues Simona Vetter
2025-05-28 9:22 ` Simona Vetter
2025-05-28 9:13 ` [PATCH 7/8] drm/vmwgfx: " Simona Vetter
2025-05-28 9:23 ` Simona Vetter
2025-05-28 9:13 ` [PATCH 8/8] drm/xe: " Simona Vetter
2025-05-28 9:24 ` Simona Vetter
2025-05-28 9:19 ` ✓ CI.Patch_applied: success for drm/gem: Audit around handle_create races Patchwork
2025-05-28 9:20 ` ✗ CI.checkpatch: warning " Patchwork
2025-05-28 9:21 ` ✓ CI.KUnit: success " Patchwork
2025-05-28 9:24 ` ✗ CI.Build: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aDbWvNUaXCCvvQkc@phenom.ffwll.local \
--to=simona.vetter@ffwll.ch \
--cc=airlied@gmail.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=intel-xe@lists.freedesktop.org \
--cc=jacek.lawrynowicz@linux.intel.com \
--cc=maarten.lankhorst@linux.intel.com \
--cc=mripard@kernel.org \
--cc=simona.vetter@intel.com \
--cc=simona@ffwll.ch \
--cc=stable@vger.kernel.org \
--cc=tzimmermann@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.