* [PATCH v6] Add CRIU support for amdgpu dmabuf @ 2025-06-13 18:23 David Francis 2025-06-13 18:23 ` [PATCH 1/3] drm: Add DRM prime interfaces to reassign GEM handle David Francis ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: David Francis @ 2025-06-13 18:23 UTC (permalink / raw) To: dri-devel Cc: tvrtko.ursulin, Felix.Kuehling, David.YatSin, Chris.Freehill, Christian.Koenig, dcostantino, sruffell, simona, mripard, tzimmermann This patch series adds support for CRIU checkpointing of processes that share memory with the amdgpu dmabuf interface. This v6 cleans up the locking and moves some of the mapping info code into amdgpu_vm.c. The mapping flags code is a placeholder awaiting the mapping flags rework. ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/3] drm: Add DRM prime interfaces to reassign GEM handle 2025-06-13 18:23 [PATCH v6] Add CRIU support for amdgpu dmabuf David Francis @ 2025-06-13 18:23 ` David Francis 2025-06-16 8:51 ` Christian König 2025-06-13 18:23 ` [PATCH 2/3] drm/amdgpu: Adding amdgpu CRIU ioctl David Francis 2025-06-13 18:23 ` [PATCH 3/3] drm/amdgpu: Allow kfd CRIU with no buffer objects David Francis 2 siblings, 1 reply; 10+ messages in thread From: David Francis @ 2025-06-13 18:23 UTC (permalink / raw) To: dri-devel Cc: tvrtko.ursulin, Felix.Kuehling, David.YatSin, Chris.Freehill, Christian.Koenig, dcostantino, sruffell, simona, mripard, tzimmermann, David Francis CRIU restore of drm buffer objects requires the ability to create or import a buffer object with a specific gem handle. Add new drm ioctl DRM_IOCTL_GEM_CHANGE_HANDLE, which takes the gem handle of an object and moves that object to a specified new gem handle. This ioctl needs to call drm_prime_remove_buf_handle, but that function acquires the prime lock, which the ioctl needs to hold for other purposes. Make drm_prime_remove_buf_handle not acquire the prime lock, and change its other caller to reflect this. Signed-off-by: David Francis <David.Francis@amd.com> --- drivers/gpu/drm/drm_gem.c | 56 ++++++++++++++++++++++++++++++++++ drivers/gpu/drm/drm_internal.h | 4 +++ drivers/gpu/drm/drm_ioctl.c | 1 + drivers/gpu/drm/drm_prime.c | 6 +--- include/uapi/drm/drm.h | 17 +++++++++++ 5 files changed, 79 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index c6240bab3fa5..31fe2f1d1137 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -282,7 +282,12 @@ drm_gem_object_release_handle(int id, void *ptr, void *data) if (obj->funcs->close) obj->funcs->close(obj, file_priv); + mutex_lock(&file_priv->prime.lock); + drm_prime_remove_buf_handle(&file_priv->prime, id); + + mutex_unlock(&file_priv->prime.lock); + drm_vma_node_revoke(&obj->vma_node, file_priv); drm_gem_object_handle_put_unlocked(obj); @@ -888,6 +893,57 @@ drm_gem_flink_ioctl(struct drm_device *dev, void *data, return ret; } +/** + * drm_gem_open_ioctl - implementation of the GEM_CHANGE_HANDLE ioctl + * @dev: drm_device + * @data: ioctl data + * @file_priv: drm file-private structure + * + * Change the handle of a GEM object to the specified one. + * The new handle must be unused. On success the old handle is closed + * and all further IOCTL should refer to the new handle only. + * Calls to DRM_IOCTL_PRIME_FD_TO_HANDLE will return the new handle. + */ +int drm_gem_change_handle_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct drm_gem_change_handle *args = data; + struct drm_gem_object *obj; + int ret; + + obj = drm_gem_object_lookup(file_priv, args->handle); + if (!obj) + return -ENOENT; + + if (args->handle == args->new_handle) + return 0; + + mutex_lock(&file_priv->prime.lock); + spin_lock(&file_priv->table_lock); + + ret = idr_alloc(&file_priv->object_idr, obj, args->new_handle, args->new_handle + 1, GFP_NOWAIT); + if (ret < 0) + goto out_unlock_table; + + spin_unlock(&file_priv->table_lock); + + ret = drm_prime_add_buf_handle(&file_priv->prime, obj->dma_buf, args->new_handle); + if (ret < 0) + goto out_unlock_prime; + + drm_prime_remove_buf_handle(&file_priv->prime, args->handle); + + spin_lock(&file_priv->table_lock); + idr_remove(&file_priv->object_idr, args->handle); + +out_unlock_table: + spin_unlock(&file_priv->table_lock); +out_unlock_prime: + mutex_unlock(&file_priv->prime.lock); + + return ret; +} + /** * drm_gem_open_ioctl - implementation of the GEM_OPEN ioctl * @dev: drm_device diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h index b2b6a8e49dda..e9d5cdf7e033 100644 --- a/drivers/gpu/drm/drm_internal.h +++ b/drivers/gpu/drm/drm_internal.h @@ -85,6 +85,8 @@ int drm_prime_fd_to_handle_ioctl(struct drm_device *dev, void *data, void drm_prime_init_file_private(struct drm_prime_file_private *prime_fpriv); void drm_prime_destroy_file_private(struct drm_prime_file_private *prime_fpriv); +int drm_prime_add_buf_handle(struct drm_prime_file_private *prime_fpriv, + struct dma_buf *dma_buf, uint32_t handle); void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv, uint32_t handle); @@ -168,6 +170,8 @@ int drm_gem_close_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int drm_gem_flink_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +int drm_gem_change_handle_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv); int drm_gem_open_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); void drm_gem_open(struct drm_device *dev, struct drm_file *file_private); diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index f593dc569d31..d8a24875a7ba 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -653,6 +653,7 @@ static const struct drm_ioctl_desc drm_ioctls[] = { DRM_IOCTL_DEF(DRM_IOCTL_GEM_CLOSE, drm_gem_close_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_GEM_FLINK, drm_gem_flink_ioctl, DRM_AUTH), DRM_IOCTL_DEF(DRM_IOCTL_GEM_OPEN, drm_gem_open_ioctl, DRM_AUTH), + DRM_IOCTL_DEF(DRM_IOCTL_GEM_CHANGE_HANDLE, drm_gem_change_handle_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETRESOURCES, drm_mode_getresources, 0), diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index bdb51c8f262e..1f2e858e5000 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -93,7 +93,7 @@ struct drm_prime_member { struct rb_node handle_rb; }; -static int drm_prime_add_buf_handle(struct drm_prime_file_private *prime_fpriv, +int drm_prime_add_buf_handle(struct drm_prime_file_private *prime_fpriv, struct dma_buf *dma_buf, uint32_t handle) { struct drm_prime_member *member; @@ -190,8 +190,6 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv, { struct rb_node *rb; - mutex_lock(&prime_fpriv->lock); - rb = prime_fpriv->handles.rb_node; while (rb) { struct drm_prime_member *member; @@ -210,8 +208,6 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv, rb = rb->rb_left; } } - - mutex_unlock(&prime_fpriv->lock); } void drm_prime_init_file_private(struct drm_prime_file_private *prime_fpriv) diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h index 7fba37b94401..84c819c171d2 100644 --- a/include/uapi/drm/drm.h +++ b/include/uapi/drm/drm.h @@ -625,6 +625,15 @@ struct drm_gem_open { __u64 size; }; +/* DRM_IOCTL_GEM_CHANGE_HANDLE ioctl argument type */ +struct drm_gem_change_handle { + /** Current handle of object */ + __u32 handle; + + /** Handle to change that object to */ + __u32 new_handle; +}; + /** * DRM_CAP_DUMB_BUFFER * @@ -1305,6 +1314,14 @@ extern "C" { */ #define DRM_IOCTL_SET_CLIENT_NAME DRM_IOWR(0xD1, struct drm_set_client_name) +/** + * DRM_IOCTL_GEM_CHANGE_HANDLE - Move an object to a different handle + * + * Some applications (notably CRIU) need objects to have specific gem handles. + * This ioctl changes the object at one gem handle to use a new gem handle. + */ +#define DRM_IOCTL_GEM_CHANGE_HANDLE DRM_IOWR(0xD2, struct drm_gem_change_handle) + /* * Device specific ioctls should only be in their respective headers * The device specific ioctl range is from 0x40 to 0x9f. -- 2.34.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] drm: Add DRM prime interfaces to reassign GEM handle 2025-06-13 18:23 ` [PATCH 1/3] drm: Add DRM prime interfaces to reassign GEM handle David Francis @ 2025-06-16 8:51 ` Christian König 0 siblings, 0 replies; 10+ messages in thread From: Christian König @ 2025-06-16 8:51 UTC (permalink / raw) To: David Francis, dri-devel Cc: tvrtko.ursulin, Felix.Kuehling, David.YatSin, Chris.Freehill, dcostantino, sruffell, simona, mripard, tzimmermann On 6/13/25 20:23, David Francis wrote: > CRIU restore of drm buffer objects requires the ability to create > or import a buffer object with a specific gem handle. > > Add new drm ioctl DRM_IOCTL_GEM_CHANGE_HANDLE, which takes > the gem handle of an object and moves that object to a > specified new gem handle. > > This ioctl needs to call drm_prime_remove_buf_handle, > but that function acquires the prime lock, which the ioctl > needs to hold for other purposes. > > Make drm_prime_remove_buf_handle not acquire the prime lock, > and change its other caller to reflect this. > > Signed-off-by: David Francis <David.Francis@amd.com> > --- > drivers/gpu/drm/drm_gem.c | 56 ++++++++++++++++++++++++++++++++++ > drivers/gpu/drm/drm_internal.h | 4 +++ > drivers/gpu/drm/drm_ioctl.c | 1 + > drivers/gpu/drm/drm_prime.c | 6 +--- > include/uapi/drm/drm.h | 17 +++++++++++ > 5 files changed, 79 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c > index c6240bab3fa5..31fe2f1d1137 100644 > --- a/drivers/gpu/drm/drm_gem.c > +++ b/drivers/gpu/drm/drm_gem.c > @@ -282,7 +282,12 @@ drm_gem_object_release_handle(int id, void *ptr, void *data) > if (obj->funcs->close) > obj->funcs->close(obj, file_priv); > > + mutex_lock(&file_priv->prime.lock); > + > drm_prime_remove_buf_handle(&file_priv->prime, id); > + > + mutex_unlock(&file_priv->prime.lock); > + > drm_vma_node_revoke(&obj->vma_node, file_priv); > > drm_gem_object_handle_put_unlocked(obj); > @@ -888,6 +893,57 @@ drm_gem_flink_ioctl(struct drm_device *dev, void *data, > return ret; > } > > +/** > + * drm_gem_open_ioctl - implementation of the GEM_CHANGE_HANDLE ioctl > + * @dev: drm_device > + * @data: ioctl data > + * @file_priv: drm file-private structure > + * > + * Change the handle of a GEM object to the specified one. > + * The new handle must be unused. On success the old handle is closed > + * and all further IOCTL should refer to the new handle only. > + * Calls to DRM_IOCTL_PRIME_FD_TO_HANDLE will return the new handle. > + */ > +int drm_gem_change_handle_ioctl(struct drm_device *dev, void *data, > + struct drm_file *file_priv) > +{ > + struct drm_gem_change_handle *args = data; > + struct drm_gem_object *obj; > + int ret; > + > + obj = drm_gem_object_lookup(file_priv, args->handle); > + if (!obj) > + return -ENOENT; > + > + if (args->handle == args->new_handle) > + return 0; > + > + mutex_lock(&file_priv->prime.lock); > + spin_lock(&file_priv->table_lock); > + > + ret = idr_alloc(&file_priv->object_idr, obj, args->new_handle, args->new_handle + 1, GFP_NOWAIT); > + if (ret < 0) > + goto out_unlock_table; > + > + spin_unlock(&file_priv->table_lock); > + > + ret = drm_prime_add_buf_handle(&file_priv->prime, obj->dma_buf, args->new_handle); > + if (ret < 0) > + goto out_unlock_prime; > + > + drm_prime_remove_buf_handle(&file_priv->prime, args->handle); We need to put this into an "if (obj->dma_buf)", exporting a GEM handle as DMA-buf is only optional. @Sima any objections to this? Or can you think of a better approach? Thanks, Christian. > + > + spin_lock(&file_priv->table_lock); > + idr_remove(&file_priv->object_idr, args->handle); > + > +out_unlock_table: > + spin_unlock(&file_priv->table_lock); > +out_unlock_prime: > + mutex_unlock(&file_priv->prime.lock); > + > + return ret; > +} > + > /** > * drm_gem_open_ioctl - implementation of the GEM_OPEN ioctl > * @dev: drm_device > diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h > index b2b6a8e49dda..e9d5cdf7e033 100644 > --- a/drivers/gpu/drm/drm_internal.h > +++ b/drivers/gpu/drm/drm_internal.h > @@ -85,6 +85,8 @@ int drm_prime_fd_to_handle_ioctl(struct drm_device *dev, void *data, > > void drm_prime_init_file_private(struct drm_prime_file_private *prime_fpriv); > void drm_prime_destroy_file_private(struct drm_prime_file_private *prime_fpriv); > +int drm_prime_add_buf_handle(struct drm_prime_file_private *prime_fpriv, > + struct dma_buf *dma_buf, uint32_t handle); > void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv, > uint32_t handle); > > @@ -168,6 +170,8 @@ int drm_gem_close_ioctl(struct drm_device *dev, void *data, > struct drm_file *file_priv); > int drm_gem_flink_ioctl(struct drm_device *dev, void *data, > struct drm_file *file_priv); > +int drm_gem_change_handle_ioctl(struct drm_device *dev, void *data, > + struct drm_file *file_priv); > int drm_gem_open_ioctl(struct drm_device *dev, void *data, > struct drm_file *file_priv); > void drm_gem_open(struct drm_device *dev, struct drm_file *file_private); > diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c > index f593dc569d31..d8a24875a7ba 100644 > --- a/drivers/gpu/drm/drm_ioctl.c > +++ b/drivers/gpu/drm/drm_ioctl.c > @@ -653,6 +653,7 @@ static const struct drm_ioctl_desc drm_ioctls[] = { > DRM_IOCTL_DEF(DRM_IOCTL_GEM_CLOSE, drm_gem_close_ioctl, DRM_RENDER_ALLOW), > DRM_IOCTL_DEF(DRM_IOCTL_GEM_FLINK, drm_gem_flink_ioctl, DRM_AUTH), > DRM_IOCTL_DEF(DRM_IOCTL_GEM_OPEN, drm_gem_open_ioctl, DRM_AUTH), > + DRM_IOCTL_DEF(DRM_IOCTL_GEM_CHANGE_HANDLE, drm_gem_change_handle_ioctl, DRM_RENDER_ALLOW), > > DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETRESOURCES, drm_mode_getresources, 0), > > diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c > index bdb51c8f262e..1f2e858e5000 100644 > --- a/drivers/gpu/drm/drm_prime.c > +++ b/drivers/gpu/drm/drm_prime.c > @@ -93,7 +93,7 @@ struct drm_prime_member { > struct rb_node handle_rb; > }; > > -static int drm_prime_add_buf_handle(struct drm_prime_file_private *prime_fpriv, > +int drm_prime_add_buf_handle(struct drm_prime_file_private *prime_fpriv, > struct dma_buf *dma_buf, uint32_t handle) > { > struct drm_prime_member *member; > @@ -190,8 +190,6 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv, > { > struct rb_node *rb; > > - mutex_lock(&prime_fpriv->lock); > - > rb = prime_fpriv->handles.rb_node; > while (rb) { > struct drm_prime_member *member; > @@ -210,8 +208,6 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv, > rb = rb->rb_left; > } > } > - > - mutex_unlock(&prime_fpriv->lock); > } > > void drm_prime_init_file_private(struct drm_prime_file_private *prime_fpriv) > diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h > index 7fba37b94401..84c819c171d2 100644 > --- a/include/uapi/drm/drm.h > +++ b/include/uapi/drm/drm.h > @@ -625,6 +625,15 @@ struct drm_gem_open { > __u64 size; > }; > > +/* DRM_IOCTL_GEM_CHANGE_HANDLE ioctl argument type */ > +struct drm_gem_change_handle { > + /** Current handle of object */ > + __u32 handle; > + > + /** Handle to change that object to */ > + __u32 new_handle; > +}; > + > /** > * DRM_CAP_DUMB_BUFFER > * > @@ -1305,6 +1314,14 @@ extern "C" { > */ > #define DRM_IOCTL_SET_CLIENT_NAME DRM_IOWR(0xD1, struct drm_set_client_name) > > +/** > + * DRM_IOCTL_GEM_CHANGE_HANDLE - Move an object to a different handle > + * > + * Some applications (notably CRIU) need objects to have specific gem handles. > + * This ioctl changes the object at one gem handle to use a new gem handle. > + */ > +#define DRM_IOCTL_GEM_CHANGE_HANDLE DRM_IOWR(0xD2, struct drm_gem_change_handle) > + > /* > * Device specific ioctls should only be in their respective headers > * The device specific ioctl range is from 0x40 to 0x9f. ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/3] drm/amdgpu: Adding amdgpu CRIU ioctl 2025-06-13 18:23 [PATCH v6] Add CRIU support for amdgpu dmabuf David Francis 2025-06-13 18:23 ` [PATCH 1/3] drm: Add DRM prime interfaces to reassign GEM handle David Francis @ 2025-06-13 18:23 ` David Francis 2025-06-16 9:01 ` Christian König 2025-06-13 18:23 ` [PATCH 3/3] drm/amdgpu: Allow kfd CRIU with no buffer objects David Francis 2 siblings, 1 reply; 10+ messages in thread From: David Francis @ 2025-06-13 18:23 UTC (permalink / raw) To: dri-devel Cc: tvrtko.ursulin, Felix.Kuehling, David.YatSin, Chris.Freehill, Christian.Koenig, dcostantino, sruffell, simona, mripard, tzimmermann, David Francis amdgpu CRIU requires an amdgpu CRIU ioctl. This ioctl has a similar interface to the amdkfd CRIU ioctl. The objects that can be checkpointed and restored are bos and vm mappings. Because a single amdgpu bo can have multiple mappings. the mappings are recorded separately. The ioctl has two modes: PROCESS_INFO, which sends to the user how many bos and vms to expect, and CHECKPOINT, which copies data about bos and vms into user-provided buffers. Restore is handled using existing amdgpu and drm ioctls. The new ioctl lives in a new file amdgpu_criu.c with its own header amdgpu_criu.h Signed-off-by: David Francis <David.Francis@amd.com> --- drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c | 234 +++++++++++++++++++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h | 34 ++++ drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 + drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 11 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 10 + drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 + include/uapi/drm/amdgpu_drm.h | 62 ++++++ 8 files changed, 357 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile index 87080c06e5fc..0863edcdd03f 100644 --- a/drivers/gpu/drm/amd/amdgpu/Makefile +++ b/drivers/gpu/drm/amd/amdgpu/Makefile @@ -63,7 +63,7 @@ amdgpu-y += amdgpu_device.o amdgpu_doorbell_mgr.o amdgpu_kms.o \ amdgpu_xgmi.o amdgpu_csa.o amdgpu_ras.o amdgpu_vm_cpu.o \ amdgpu_vm_sdma.o amdgpu_discovery.o amdgpu_ras_eeprom.o amdgpu_nbio.o \ amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \ - amdgpu_fw_attestation.o amdgpu_securedisplay.o \ + amdgpu_fw_attestation.o amdgpu_securedisplay.o amdgpu_criu.o \ amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \ amdgpu_ring_mux.o amdgpu_xcp.o amdgpu_seq64.o amdgpu_aca.o amdgpu_dev_coredump.o \ amdgpu_cper.o amdgpu_userq_fence.o amdgpu_eviction_fence.o diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c new file mode 100644 index 000000000000..8141ab09698c --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c @@ -0,0 +1,234 @@ +/* SPDX-License-Identifier: MIT */ +/* +* Copyright 2025 Advanced Micro Devices, Inc. +* +* Permission is hereby granted, free of charge, to any person obtaining a +* copy of this software and associated documentation files (the "Software"), +* to deal in the Software without restriction, including without limitation +* the rights to use, copy, modify, merge, publish, distribute, sublicense, +* and/or sell copies of the Software, and to permit persons to whom the +* Software is furnished to do so, subject to the following conditions: +* +* The above copyright notice and this permission notice shall be included in +* all copies or substantial portions of the Software. +* +* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +* THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR +* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, +* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR +* OTHER DEALINGS IN THE SOFTWARE. +*/ + +#include <linux/dma-buf.h> +#include <linux/hashtable.h> +#include <linux/mutex.h> +#include <linux/random.h> + +#include <drm/amdgpu_drm.h> +#include <drm/drm_device.h> +#include <drm/drm_file.h> + +#include "amdgpu_criu.h" + +#include <drm/amdgpu_drm.h> +#include <drm/drm_drv.h> +#include <drm/drm_exec.h> +#include <drm/drm_gem_ttm_helper.h> +#include <drm/ttm/ttm_tt.h> +#include <linux/interval_tree_generic.h> + +#include "amdgpu.h" +#include "amdgpu_display.h" +#include "amdgpu_dma_buf.h" +#include "amdgpu_hmm.h" +#include "amdgpu_xgmi.h" + +static uint32_t hardware_flags_to_uapi_flags(struct amdgpu_device *adev, uint64_t pte_flags) +{ + uint32_t gem_flags = 0; + + //This function will be replaced by the mapping flags rework + + if (pte_flags & AMDGPU_PTE_EXECUTABLE) + gem_flags |= AMDGPU_VM_PAGE_EXECUTABLE; + if (pte_flags & AMDGPU_PTE_READABLE) + gem_flags |= AMDGPU_VM_PAGE_READABLE; + if (pte_flags & AMDGPU_PTE_WRITEABLE) + gem_flags |= AMDGPU_VM_PAGE_WRITEABLE; + if (pte_flags & AMDGPU_PTE_PRT_FLAG(adev)) + gem_flags |= AMDGPU_VM_PAGE_PRT; + if (pte_flags & AMDGPU_PTE_NOALLOC) + gem_flags |= AMDGPU_VM_PAGE_NOALLOC; + + return gem_flags; +} + + +/** + * amdgpu_criu_bo_info_ioctl - get information about a process' buffer objects + * + * @dev: drm device pointer + * @data: drm_amdgpu_criu_bo_info_args + * @filp: drm file pointer + * + * num_bos is set as an input to the size of the bo_buckets array. + * num_bos is sent back as output as the number of bos in the process. + * If that number is larger than the size of the array, the ioctl must + * be retried. + * + * Returns: + * 0 for success, -errno for errors. + */ +int amdgpu_criu_bo_info_ioctl(struct drm_device *dev, void *data, + struct drm_file *filp) +{ + struct drm_amdgpu_criu_bo_info_args *args = data; + struct drm_amdgpu_criu_bo_bucket *bo_buckets; + struct drm_gem_object *gobj; + int id, ret = 0; + int bo_index = 0; + int num_bos = 0; + + spin_lock(&filp->table_lock); + idr_for_each_entry(&filp->object_idr, gobj, id) + num_bos += 1; + spin_unlock(&filp->table_lock); + + if (args->num_bos < num_bos) { + args->num_bos = num_bos; + goto exit; + } + args->num_bos = num_bos; + if (num_bos == 0) { + goto exit; + } + + bo_buckets = kvzalloc(num_bos * sizeof(*bo_buckets), GFP_KERNEL); + if (!bo_buckets) { + ret = -ENOMEM; + goto free_buckets; + } + + spin_lock(&filp->table_lock); + idr_for_each_entry(&filp->object_idr, gobj, id) { + struct amdgpu_bo *bo = gem_to_amdgpu_bo(gobj); + struct drm_amdgpu_criu_bo_bucket *bo_bucket; + + bo_bucket = &bo_buckets[bo_index]; + + bo_bucket->size = amdgpu_bo_size(bo); + bo_bucket->offset = amdgpu_bo_mmap_offset(bo); + bo_bucket->alloc_flags = bo->flags & (~AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE); + bo_bucket->preferred_domains = bo->preferred_domains; + bo_bucket->gem_handle = id; + + if (bo->tbo.base.import_attach) + bo_bucket->flags |= AMDGPU_CRIU_BO_FLAG_IS_IMPORT; + + bo_index += 1; + } + spin_unlock(&filp->table_lock); + + ret = copy_to_user((void __user *)args->bo_buckets, bo_buckets, num_bos * sizeof(*bo_buckets)); + if (ret) { + pr_debug("Failed to copy BO information to user\n"); + ret = -EFAULT; + } + +free_buckets: + kvfree(bo_buckets); +exit: + + return ret; +} + +/** + * amdgpu_criu_bo_info_ioctl - get information about a process' buffer objects + * + * @dev: drm device pointer + * @data: drm_amdgpu_criu_mapping_info_args + * @filp: drm file pointer + * + * num_mappings is set as an input to the size of the vm_buckets array. + * num_mappings is sent back as output as the number of mappings the bo has. + * If that number is larger than the size of the array, the ioctl must + * be retried. + * + * Returns: + * 0 for success, -errno for errors. + */ +int amdgpu_criu_mapping_info_ioctl(struct drm_device *dev, void *data, + struct drm_file *filp) +{ + struct drm_amdgpu_criu_mapping_info_args *args = data; + struct drm_gem_object *gobj = idr_find(&filp->object_idr, args->gem_handle); + struct amdgpu_vm *avm = &((struct amdgpu_fpriv *)filp->driver_priv)->vm; + struct amdgpu_bo *bo = gem_to_amdgpu_bo(gobj); + struct amdgpu_bo_va *bo_va = amdgpu_vm_bo_find(avm, bo); + struct amdgpu_fpriv *fpriv = filp->driver_priv; + struct drm_amdgpu_criu_vm_bucket *vm_buckets; + struct amdgpu_bo_va_mapping *mapping; + struct drm_exec exec; + int vm_index = 0; + int ret = 0; + + vm_buckets = kvzalloc(args->num_mappings * sizeof(*vm_buckets), GFP_KERNEL); + if (!vm_buckets) { + ret = -ENOMEM; + goto free_vms; + } + + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT | + DRM_EXEC_IGNORE_DUPLICATES, 0); + drm_exec_until_all_locked(&exec) { + if (gobj) { + ret = drm_exec_lock_obj(&exec, gobj); + drm_exec_retry_on_contention(&exec); + if (ret) + goto unlock_exec; + } + + ret = amdgpu_vm_lock_pd(&fpriv->vm, &exec, 2); + drm_exec_retry_on_contention(&exec); + if (ret) + goto unlock_exec; + } + + amdgpu_vm_it_for_each_entry(avm, mapping, 0, U64_MAX) { + if (mapping->bo_va == bo_va) { + if (vm_index < args->num_mappings) { + vm_buckets[vm_index].start = mapping->start; + vm_buckets[vm_index].last = mapping->last; + vm_buckets[vm_index].offset = mapping->offset; + vm_buckets[vm_index].flags = hardware_flags_to_uapi_flags(drm_to_adev(dev), mapping->flags); + } + vm_index += 1; + } + } + + drm_exec_fini(&exec); + + if (vm_index > 0) { + if (vm_index <= args->num_mappings) { + ret = copy_to_user((void __user *)args->vm_buckets, vm_buckets, vm_index * sizeof(*vm_buckets)); + if (ret) { + pr_debug("Failed to copy BO information to user\n"); + ret = -EFAULT; + } + } + } + args->num_mappings = vm_index; + + + kvfree(vm_buckets); + + return ret; +unlock_exec: + drm_exec_fini(&exec); +free_vms: + kvfree(vm_buckets); + + return ret; +} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h new file mode 100644 index 000000000000..9c196973ed0f --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: MIT */ +/* +* Copyright 2025 Advanced Micro Devices, Inc. +* +* Permission is hereby granted, free of charge, to any person obtaining a +* copy of this software and associated documentation files (the "Software"), +* to deal in the Software without restriction, including without limitation +* the rights to use, copy, modify, merge, publish, distribute, sublicense, +* and/or sell copies of the Software, and to permit persons to whom the +* Software is furnished to do so, subject to the following conditions: +* +* The above copyright notice and this permission notice shall be included in +* all copies or substantial portions of the Software. +* +* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +* THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR +* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, +* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR +* OTHER DEALINGS IN THE SOFTWARE. +*/ + +#ifndef __AMDGPU_CRIU_H__ +#define __AMDGPU_CRIU_H__ + +#include <drm/amdgpu_drm.h> + +int amdgpu_criu_bo_info_ioctl(struct drm_device *dev, void *data, + struct drm_file *filp); +int amdgpu_criu_mapping_info_ioctl(struct drm_device *dev, void *data, + struct drm_file *filp); + +#endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 4db92e0a60da..5f3de93a665d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -53,6 +53,7 @@ #include "amdgpu_xgmi.h" #include "amdgpu_userq.h" #include "amdgpu_userq_fence.h" +#include "amdgpu_criu.h" #include "../amdxcp/amdgpu_xcp_drv.h" /* @@ -3021,6 +3022,8 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = { DRM_IOCTL_DEF_DRV(AMDGPU_USERQ, amdgpu_userq_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(AMDGPU_USERQ_SIGNAL, amdgpu_userq_signal_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(AMDGPU_USERQ_WAIT, amdgpu_userq_wait_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(AMDGPU_CRIU_BO_INFO, amdgpu_criu_bo_info_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(AMDGPU_CRIU_MAPPING_INFO, amdgpu_criu_mapping_info_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), }; static const struct drm_driver amdgpu_kms_driver = { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 3911c78f8282..f803908cf46d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -3156,3 +3156,14 @@ bool amdgpu_vm_is_bo_always_valid(struct amdgpu_vm *vm, struct amdgpu_bo *bo) { return bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv; } + + +struct amdgpu_bo_va_mapping *amdgpu_vm_it_first_mapping_in_range(struct amdgpu_vm *avm, uint64_t start, uint64_t end) +{ + return amdgpu_vm_it_iter_first(&avm->va, start, end); +} + +struct amdgpu_bo_va_mapping *amdgpu_vm_it_next_mapping_in_range(struct amdgpu_bo_va_mapping *mapping, uint64_t start, uint64_t end) +{ + return amdgpu_vm_it_iter_next(mapping, start, end); +} \ No newline at end of file diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h index f3ad687125ad..cd7d3940cc7a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h @@ -668,4 +668,14 @@ void amdgpu_vm_tlb_fence_create(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct dma_fence **fence); +struct amdgpu_bo_va_mapping *amdgpu_vm_it_first_mapping_in_range( + struct amdgpu_vm *avm, uint64_t start, uint64_t end); +struct amdgpu_bo_va_mapping *amdgpu_vm_it_next_mapping_in_range( + struct amdgpu_bo_va_mapping *mapping, uint64_t start, uint64_t end); + +#define amdgpu_vm_it_for_each_entry(avm, mapping, start, end) \ + for (mapping = amdgpu_vm_it_first_mapping_in_range(avm, start, end); \ + mapping; \ + mapping = amdgpu_vm_it_next_mapping_in_range(mapping, start, end)) + #endif diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index a2149afa5803..a8cf2d4580cc 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -45,6 +45,8 @@ #include "amdgpu_dma_buf.h" #include "kfd_debug.h" +#include "amdgpu_criu.h" + static long kfd_ioctl(struct file *, unsigned int, unsigned long); static int kfd_open(struct inode *, struct file *); static int kfd_release(struct inode *, struct file *); diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h index 45c4fa13499c..16aee825e116 100644 --- a/include/uapi/drm/amdgpu_drm.h +++ b/include/uapi/drm/amdgpu_drm.h @@ -57,6 +57,10 @@ extern "C" { #define DRM_AMDGPU_USERQ 0x16 #define DRM_AMDGPU_USERQ_SIGNAL 0x17 #define DRM_AMDGPU_USERQ_WAIT 0x18 +#define DRM_AMDGPU_CRIU_OP 0x19 + +#define DRM_AMDGPU_CRIU_BO_INFO 0x20 +#define DRM_AMDGPU_CRIU_MAPPING_INFO 0x21 #define DRM_IOCTL_AMDGPU_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create) #define DRM_IOCTL_AMDGPU_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap) @@ -77,6 +81,10 @@ extern "C" { #define DRM_IOCTL_AMDGPU_USERQ DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_USERQ, union drm_amdgpu_userq) #define DRM_IOCTL_AMDGPU_USERQ_SIGNAL DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_USERQ_SIGNAL, struct drm_amdgpu_userq_signal) #define DRM_IOCTL_AMDGPU_USERQ_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_USERQ_WAIT, struct drm_amdgpu_userq_wait) +#define DRM_IOCTL_AMDGPU_CRIU_OP DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_CRIU_OP, struct drm_amdgpu_criu_args) + +#define DRM_IOCTL_AMDGPU_CRIU_BO_INFO DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_CRIU_BO_INFO, struct drm_amdgpu_criu_bo_info_args) +#define DRM_IOCTL_AMDGPU_CRIU_MAPPING_INFO DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_CRIU_MAPPING_INFO, struct drm_amdgpu_criu_mapping_info_args) /** * DOC: memory domains @@ -1626,6 +1634,60 @@ struct drm_color_ctm_3x4 { __u64 matrix[12]; }; +#define AMDGPU_CRIU_BO_FLAG_IS_IMPORT (1 << 0) + +struct drm_amdgpu_criu_bo_info_args { + /* IN: Size of bo_buckets buffer. OUT: Number of bos in process (if larger than size of buffer, must retry) */ + __u32 num_bos; + + /* User pointer to array of drm_amdgpu_criu_bo_bucket */ + __u64 bo_buckets; +}; + +struct drm_amdgpu_criu_bo_bucket { + /* Size of bo */ + __u64 size; + + /* Offset of bo in device file */ + __u64 offset; + + /* GEM_CREATE flags for re-creation of buffer */ + __u64 alloc_flags; + + /* Pending how to handle this; provides information needed to remake the buffer on restore */ + __u32 preferred_domains; + + /* Currently just one flag: IS_IMPORT */ + __u32 flags; + + __u32 gem_handle; +}; + +struct drm_amdgpu_criu_mapping_info_args { + /* Handle of bo to get mappings of */ + __u32 gem_handle; + + /* IN: Size of vm_buckets buffer. OUT: Number of bos in process (if larger than size of buffer, must retry) */ + __u32 num_mappings; + + /* User pointer to array of drm_amdgpu_criu_vm_bucket */ + __u64 vm_buckets; +}; + +struct drm_amdgpu_criu_vm_bucket { + /* Start of mapping (in number of pages) */ + __u64 start; + + /* End of mapping (in number of pages) */ + __u64 last; + + /* Mapping offset */ + __u64 offset; + + /* flags needed to recreate mapping; still pending how to get these */ + __u64 flags; +}; + #if defined(__cplusplus) } #endif -- 2.34.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] drm/amdgpu: Adding amdgpu CRIU ioctl 2025-06-13 18:23 ` [PATCH 2/3] drm/amdgpu: Adding amdgpu CRIU ioctl David Francis @ 2025-06-16 9:01 ` Christian König 2025-06-16 14:15 ` Francis, David 0 siblings, 1 reply; 10+ messages in thread From: Christian König @ 2025-06-16 9:01 UTC (permalink / raw) To: David Francis, dri-devel Cc: tvrtko.ursulin, Felix.Kuehling, David.YatSin, Chris.Freehill, dcostantino, sruffell, simona, mripard, tzimmermann On 6/13/25 20:23, David Francis wrote: > amdgpu CRIU requires an amdgpu CRIU ioctl. This ioctl > has a similar interface to the amdkfd CRIU ioctl. > > The objects that can be checkpointed and restored are bos and vm > mappings. Because a single amdgpu bo can have multiple mappings. > the mappings are recorded separately. > > The ioctl has two modes: PROCESS_INFO, which sends to the user > how many bos and vms to expect, and CHECKPOINT, which copies > data about bos and vms into user-provided buffers. > > Restore is handled using existing amdgpu and drm ioctls. > > The new ioctl lives in a new file amdgpu_criu.c with its own > header amdgpu_criu.h > > Signed-off-by: David Francis <David.Francis@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c | 234 +++++++++++++++++++++++ > drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h | 34 ++++ > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 + > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 11 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 10 + > drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 + > include/uapi/drm/amdgpu_drm.h | 62 ++++++ > 8 files changed, 357 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c > create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h > > diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile > index 87080c06e5fc..0863edcdd03f 100644 > --- a/drivers/gpu/drm/amd/amdgpu/Makefile > +++ b/drivers/gpu/drm/amd/amdgpu/Makefile > @@ -63,7 +63,7 @@ amdgpu-y += amdgpu_device.o amdgpu_doorbell_mgr.o amdgpu_kms.o \ > amdgpu_xgmi.o amdgpu_csa.o amdgpu_ras.o amdgpu_vm_cpu.o \ > amdgpu_vm_sdma.o amdgpu_discovery.o amdgpu_ras_eeprom.o amdgpu_nbio.o \ > amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \ > - amdgpu_fw_attestation.o amdgpu_securedisplay.o \ > + amdgpu_fw_attestation.o amdgpu_securedisplay.o amdgpu_criu.o \ > amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \ > amdgpu_ring_mux.o amdgpu_xcp.o amdgpu_seq64.o amdgpu_aca.o amdgpu_dev_coredump.o \ > amdgpu_cper.o amdgpu_userq_fence.o amdgpu_eviction_fence.o > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c > new file mode 100644 > index 000000000000..8141ab09698c > --- /dev/null > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c > @@ -0,0 +1,234 @@ > +/* SPDX-License-Identifier: MIT */ > +/* > +* Copyright 2025 Advanced Micro Devices, Inc. > +* > +* Permission is hereby granted, free of charge, to any person obtaining a > +* copy of this software and associated documentation files (the "Software"), > +* to deal in the Software without restriction, including without limitation > +* the rights to use, copy, modify, merge, publish, distribute, sublicense, > +* and/or sell copies of the Software, and to permit persons to whom the > +* Software is furnished to do so, subject to the following conditions: > +* > +* The above copyright notice and this permission notice shall be included in > +* all copies or substantial portions of the Software. > +* > +* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > +* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > +* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > +* THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > +* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > +* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > +* OTHER DEALINGS IN THE SOFTWARE. > +*/ > + > +#include <linux/dma-buf.h> > +#include <linux/hashtable.h> > +#include <linux/mutex.h> > +#include <linux/random.h> > + > +#include <drm/amdgpu_drm.h> > +#include <drm/drm_device.h> > +#include <drm/drm_file.h> > + > +#include "amdgpu_criu.h" > + > +#include <drm/amdgpu_drm.h> > +#include <drm/drm_drv.h> > +#include <drm/drm_exec.h> > +#include <drm/drm_gem_ttm_helper.h> > +#include <drm/ttm/ttm_tt.h> > +#include <linux/interval_tree_generic.h> > + > +#include "amdgpu.h" > +#include "amdgpu_display.h" > +#include "amdgpu_dma_buf.h" > +#include "amdgpu_hmm.h" > +#include "amdgpu_xgmi.h" > + > +static uint32_t hardware_flags_to_uapi_flags(struct amdgpu_device *adev, uint64_t pte_flags) > +{ > + uint32_t gem_flags = 0; > + > + //This function will be replaced by the mapping flags rework > + > + if (pte_flags & AMDGPU_PTE_EXECUTABLE) > + gem_flags |= AMDGPU_VM_PAGE_EXECUTABLE; > + if (pte_flags & AMDGPU_PTE_READABLE) > + gem_flags |= AMDGPU_VM_PAGE_READABLE; > + if (pte_flags & AMDGPU_PTE_WRITEABLE) > + gem_flags |= AMDGPU_VM_PAGE_WRITEABLE; > + if (pte_flags & AMDGPU_PTE_PRT_FLAG(adev)) > + gem_flags |= AMDGPU_VM_PAGE_PRT; > + if (pte_flags & AMDGPU_PTE_NOALLOC) > + gem_flags |= AMDGPU_VM_PAGE_NOALLOC; > + > + return gem_flags; > +} > + > + > +/** > + * amdgpu_criu_bo_info_ioctl - get information about a process' buffer objects > + * > + * @dev: drm device pointer > + * @data: drm_amdgpu_criu_bo_info_args > + * @filp: drm file pointer > + * > + * num_bos is set as an input to the size of the bo_buckets array. > + * num_bos is sent back as output as the number of bos in the process. > + * If that number is larger than the size of the array, the ioctl must > + * be retried. > + * > + * Returns: > + * 0 for success, -errno for errors. > + */ > +int amdgpu_criu_bo_info_ioctl(struct drm_device *dev, void *data, > + struct drm_file *filp) > +{ > + struct drm_amdgpu_criu_bo_info_args *args = data; > + struct drm_amdgpu_criu_bo_bucket *bo_buckets; > + struct drm_gem_object *gobj; > + int id, ret = 0; > + int bo_index = 0; > + int num_bos = 0; > + > + spin_lock(&filp->table_lock); > + idr_for_each_entry(&filp->object_idr, gobj, id) > + num_bos += 1; > + spin_unlock(&filp->table_lock); > + > + if (args->num_bos < num_bos) { > + args->num_bos = num_bos; > + goto exit; > + } > + args->num_bos = num_bos; > + if (num_bos == 0) { > + goto exit; > + } > + > + bo_buckets = kvzalloc(num_bos * sizeof(*bo_buckets), GFP_KERNEL); > + if (!bo_buckets) { > + ret = -ENOMEM; > + goto free_buckets; > + } > + > + spin_lock(&filp->table_lock); > + idr_for_each_entry(&filp->object_idr, gobj, id) { > + struct amdgpu_bo *bo = gem_to_amdgpu_bo(gobj); > + struct drm_amdgpu_criu_bo_bucket *bo_bucket; > + > + bo_bucket = &bo_buckets[bo_index]; > + > + bo_bucket->size = amdgpu_bo_size(bo); > + bo_bucket->offset = amdgpu_bo_mmap_offset(bo); Not all BOs have a MMAP offset. Additional to that we already have an IOCTL to query the mmap offset which does the appropriate checks and doesn't potentially crash when there is none. > + bo_bucket->alloc_flags = bo->flags & (~AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE); > + bo_bucket->preferred_domains = bo->preferred_domains; > + bo_bucket->gem_handle = id; > + > + if (bo->tbo.base.import_attach) > + bo_bucket->flags |= AMDGPU_CRIU_BO_FLAG_IS_IMPORT; > + > + bo_index += 1; > + } > + spin_unlock(&filp->table_lock); > + > + ret = copy_to_user((void __user *)args->bo_buckets, bo_buckets, num_bos * sizeof(*bo_buckets)); > + if (ret) { > + pr_debug("Failed to copy BO information to user\n"); > + ret = -EFAULT; > + } > + > +free_buckets: > + kvfree(bo_buckets); > +exit: > + > + return ret; > +} > + > +/** > + * amdgpu_criu_bo_info_ioctl - get information about a process' buffer objects > + * > + * @dev: drm device pointer > + * @data: drm_amdgpu_criu_mapping_info_args > + * @filp: drm file pointer > + * > + * num_mappings is set as an input to the size of the vm_buckets array. > + * num_mappings is sent back as output as the number of mappings the bo has. > + * If that number is larger than the size of the array, the ioctl must > + * be retried. > + * > + * Returns: > + * 0 for success, -errno for errors. > + */ > +int amdgpu_criu_mapping_info_ioctl(struct drm_device *dev, void *data, > + struct drm_file *filp) Please split adding that into a separate patch. > +{ > + struct drm_amdgpu_criu_mapping_info_args *args = data; > + struct drm_gem_object *gobj = idr_find(&filp->object_idr, args->gem_handle); > + struct amdgpu_vm *avm = &((struct amdgpu_fpriv *)filp->driver_priv)->vm; > + struct amdgpu_bo *bo = gem_to_amdgpu_bo(gobj); > + struct amdgpu_bo_va *bo_va = amdgpu_vm_bo_find(avm, bo); > + struct amdgpu_fpriv *fpriv = filp->driver_priv; > + struct drm_amdgpu_criu_vm_bucket *vm_buckets; > + struct amdgpu_bo_va_mapping *mapping; > + struct drm_exec exec; > + int vm_index = 0; That needs a better name. > + int ret = 0; Don't initialize return variables if you don't need it. That is very bad practice. > + > + vm_buckets = kvzalloc(args->num_mappings * sizeof(*vm_buckets), GFP_KERNEL); > + if (!vm_buckets) { > + ret = -ENOMEM; > + goto free_vms; > + } > + > + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT | > + DRM_EXEC_IGNORE_DUPLICATES, 0); > + drm_exec_until_all_locked(&exec) { > + if (gobj) { > + ret = drm_exec_lock_obj(&exec, gobj); > + drm_exec_retry_on_contention(&exec); > + if (ret) > + goto unlock_exec; > + } > + > + ret = amdgpu_vm_lock_pd(&fpriv->vm, &exec, 2); > + drm_exec_retry_on_contention(&exec); > + if (ret) > + goto unlock_exec; > + } > + > + amdgpu_vm_it_for_each_entry(avm, mapping, 0, U64_MAX) { > + if (mapping->bo_va == bo_va) { > + if (vm_index < args->num_mappings) { > + vm_buckets[vm_index].start = mapping->start; > + vm_buckets[vm_index].last = mapping->last; > + vm_buckets[vm_index].offset = mapping->offset; > + vm_buckets[vm_index].flags = hardware_flags_to_uapi_flags(drm_to_adev(dev), mapping->flags); > + } > + vm_index += 1; > + } > + } This chunk should go into amdgpu_vm.c And I strongly suggest to not go over the mapping rb tree but rather the list in the bo_va. Regards, Christian. > + > + drm_exec_fini(&exec); > + > + if (vm_index > 0) { > + if (vm_index <= args->num_mappings) { > + ret = copy_to_user((void __user *)args->vm_buckets, vm_buckets, vm_index * sizeof(*vm_buckets)); > + if (ret) { > + pr_debug("Failed to copy BO information to user\n"); > + ret = -EFAULT; > + } > + } > + } > + args->num_mappings = vm_index; > + > + > + kvfree(vm_buckets); > + > + return ret; > +unlock_exec: > + drm_exec_fini(&exec); > +free_vms: > + kvfree(vm_buckets); > + > + return ret; > +} > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h > new file mode 100644 > index 000000000000..9c196973ed0f > --- /dev/null > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h > @@ -0,0 +1,34 @@ > +/* SPDX-License-Identifier: MIT */ > +/* > +* Copyright 2025 Advanced Micro Devices, Inc. > +* > +* Permission is hereby granted, free of charge, to any person obtaining a > +* copy of this software and associated documentation files (the "Software"), > +* to deal in the Software without restriction, including without limitation > +* the rights to use, copy, modify, merge, publish, distribute, sublicense, > +* and/or sell copies of the Software, and to permit persons to whom the > +* Software is furnished to do so, subject to the following conditions: > +* > +* The above copyright notice and this permission notice shall be included in > +* all copies or substantial portions of the Software. > +* > +* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > +* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > +* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > +* THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > +* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > +* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > +* OTHER DEALINGS IN THE SOFTWARE. > +*/ > + > +#ifndef __AMDGPU_CRIU_H__ > +#define __AMDGPU_CRIU_H__ > + > +#include <drm/amdgpu_drm.h> > + > +int amdgpu_criu_bo_info_ioctl(struct drm_device *dev, void *data, > + struct drm_file *filp); > +int amdgpu_criu_mapping_info_ioctl(struct drm_device *dev, void *data, > + struct drm_file *filp); > + > +#endif > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > index 4db92e0a60da..5f3de93a665d 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > @@ -53,6 +53,7 @@ > #include "amdgpu_xgmi.h" > #include "amdgpu_userq.h" > #include "amdgpu_userq_fence.h" > +#include "amdgpu_criu.h" > #include "../amdxcp/amdgpu_xcp_drv.h" > > /* > @@ -3021,6 +3022,8 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = { > DRM_IOCTL_DEF_DRV(AMDGPU_USERQ, amdgpu_userq_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), > DRM_IOCTL_DEF_DRV(AMDGPU_USERQ_SIGNAL, amdgpu_userq_signal_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), > DRM_IOCTL_DEF_DRV(AMDGPU_USERQ_WAIT, amdgpu_userq_wait_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), > + DRM_IOCTL_DEF_DRV(AMDGPU_CRIU_BO_INFO, amdgpu_criu_bo_info_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), > + DRM_IOCTL_DEF_DRV(AMDGPU_CRIU_MAPPING_INFO, amdgpu_criu_mapping_info_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), > }; > > static const struct drm_driver amdgpu_kms_driver = { > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > index 3911c78f8282..f803908cf46d 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > @@ -3156,3 +3156,14 @@ bool amdgpu_vm_is_bo_always_valid(struct amdgpu_vm *vm, struct amdgpu_bo *bo) > { > return bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv; > } > + > + > +struct amdgpu_bo_va_mapping *amdgpu_vm_it_first_mapping_in_range(struct amdgpu_vm *avm, uint64_t start, uint64_t end) > +{ > + return amdgpu_vm_it_iter_first(&avm->va, start, end); > +} > + > +struct amdgpu_bo_va_mapping *amdgpu_vm_it_next_mapping_in_range(struct amdgpu_bo_va_mapping *mapping, uint64_t start, uint64_t end) > +{ > + return amdgpu_vm_it_iter_next(mapping, start, end); > +} > \ No newline at end of file > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > index f3ad687125ad..cd7d3940cc7a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > @@ -668,4 +668,14 @@ void amdgpu_vm_tlb_fence_create(struct amdgpu_device *adev, > struct amdgpu_vm *vm, > struct dma_fence **fence); > > +struct amdgpu_bo_va_mapping *amdgpu_vm_it_first_mapping_in_range( > + struct amdgpu_vm *avm, uint64_t start, uint64_t end); > +struct amdgpu_bo_va_mapping *amdgpu_vm_it_next_mapping_in_range( > + struct amdgpu_bo_va_mapping *mapping, uint64_t start, uint64_t end); > + > +#define amdgpu_vm_it_for_each_entry(avm, mapping, start, end) \ > + for (mapping = amdgpu_vm_it_first_mapping_in_range(avm, start, end); \ > + mapping; \ > + mapping = amdgpu_vm_it_next_mapping_in_range(mapping, start, end)) > + > #endif > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > index a2149afa5803..a8cf2d4580cc 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > @@ -45,6 +45,8 @@ > #include "amdgpu_dma_buf.h" > #include "kfd_debug.h" > > +#include "amdgpu_criu.h" > + > static long kfd_ioctl(struct file *, unsigned int, unsigned long); > static int kfd_open(struct inode *, struct file *); > static int kfd_release(struct inode *, struct file *); > diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h > index 45c4fa13499c..16aee825e116 100644 > --- a/include/uapi/drm/amdgpu_drm.h > +++ b/include/uapi/drm/amdgpu_drm.h > @@ -57,6 +57,10 @@ extern "C" { > #define DRM_AMDGPU_USERQ 0x16 > #define DRM_AMDGPU_USERQ_SIGNAL 0x17 > #define DRM_AMDGPU_USERQ_WAIT 0x18 > +#define DRM_AMDGPU_CRIU_OP 0x19 > + > +#define DRM_AMDGPU_CRIU_BO_INFO 0x20 > +#define DRM_AMDGPU_CRIU_MAPPING_INFO 0x21 > > #define DRM_IOCTL_AMDGPU_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create) > #define DRM_IOCTL_AMDGPU_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap) > @@ -77,6 +81,10 @@ extern "C" { > #define DRM_IOCTL_AMDGPU_USERQ DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_USERQ, union drm_amdgpu_userq) > #define DRM_IOCTL_AMDGPU_USERQ_SIGNAL DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_USERQ_SIGNAL, struct drm_amdgpu_userq_signal) > #define DRM_IOCTL_AMDGPU_USERQ_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_USERQ_WAIT, struct drm_amdgpu_userq_wait) > +#define DRM_IOCTL_AMDGPU_CRIU_OP DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_CRIU_OP, struct drm_amdgpu_criu_args) > + > +#define DRM_IOCTL_AMDGPU_CRIU_BO_INFO DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_CRIU_BO_INFO, struct drm_amdgpu_criu_bo_info_args) > +#define DRM_IOCTL_AMDGPU_CRIU_MAPPING_INFO DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_CRIU_MAPPING_INFO, struct drm_amdgpu_criu_mapping_info_args) > > /** > * DOC: memory domains > @@ -1626,6 +1634,60 @@ struct drm_color_ctm_3x4 { > __u64 matrix[12]; > }; > > +#define AMDGPU_CRIU_BO_FLAG_IS_IMPORT (1 << 0) > + > +struct drm_amdgpu_criu_bo_info_args { > + /* IN: Size of bo_buckets buffer. OUT: Number of bos in process (if larger than size of buffer, must retry) */ > + __u32 num_bos; > + > + /* User pointer to array of drm_amdgpu_criu_bo_bucket */ > + __u64 bo_buckets; > +}; > + > +struct drm_amdgpu_criu_bo_bucket { > + /* Size of bo */ > + __u64 size; > + > + /* Offset of bo in device file */ > + __u64 offset; > + > + /* GEM_CREATE flags for re-creation of buffer */ > + __u64 alloc_flags; > + > + /* Pending how to handle this; provides information needed to remake the buffer on restore */ > + __u32 preferred_domains; > + > + /* Currently just one flag: IS_IMPORT */ > + __u32 flags; > + > + __u32 gem_handle; > +}; > + > +struct drm_amdgpu_criu_mapping_info_args { > + /* Handle of bo to get mappings of */ > + __u32 gem_handle; > + > + /* IN: Size of vm_buckets buffer. OUT: Number of bos in process (if larger than size of buffer, must retry) */ > + __u32 num_mappings; > + > + /* User pointer to array of drm_amdgpu_criu_vm_bucket */ > + __u64 vm_buckets; > +}; > + > +struct drm_amdgpu_criu_vm_bucket { > + /* Start of mapping (in number of pages) */ > + __u64 start; > + > + /* End of mapping (in number of pages) */ > + __u64 last; > + > + /* Mapping offset */ > + __u64 offset; > + > + /* flags needed to recreate mapping; still pending how to get these */ > + __u64 flags; > +}; > + > #if defined(__cplusplus) > } > #endif ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] drm/amdgpu: Adding amdgpu CRIU ioctl 2025-06-16 9:01 ` Christian König @ 2025-06-16 14:15 ` Francis, David 0 siblings, 0 replies; 10+ messages in thread From: Francis, David @ 2025-06-16 14:15 UTC (permalink / raw) To: Koenig, Christian, dri-devel@lists.freedesktop.org Cc: tvrtko.ursulin@igalia.com, Kuehling, Felix, Yat Sin, David, Freehill, Chris, dcostantino@meta.com, sruffell@meta.com, simona@ffwll.ch, mripard@kernel.org, tzimmermann@suse.de [-- Attachment #1: Type: text/plain, Size: 22177 bytes --] [AMD Official Use Only - AMD Internal Distribution Only] > > + amdgpu_vm_it_for_each_entry(avm, mapping, 0, U64_MAX) { > > + if (mapping->bo_va == bo_va) { > > + if (vm_index < args->num_mappings) { > > + vm_buckets[vm_index].start = mapping->start; > > + vm_buckets[vm_index].last = mapping->last; > > + vm_buckets[vm_index].offset = mapping->offset; > > + vm_buckets[vm_index].flags = hardware_flags_to_uapi_flags(drm_to_adev(dev), mapping->flags); > > + } > > + vm_index += 1; > > + } > > + } > > This chunk should go into amdgpu_vm.c And I strongly suggest to not go over the mapping rb tree but rather the list in the bo_va. > Do you mean the invalids and valids lists? I remember you having objections to using those in a previous patch. David ________________________________ From: Koenig, Christian <Christian.Koenig@amd.com> Sent: Monday, June 16, 2025 5:01 AM To: Francis, David <David.Francis@amd.com>; dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org> Cc: tvrtko.ursulin@igalia.com <tvrtko.ursulin@igalia.com>; Kuehling, Felix <Felix.Kuehling@amd.com>; Yat Sin, David <David.YatSin@amd.com>; Freehill, Chris <Chris.Freehill@amd.com>; dcostantino@meta.com <dcostantino@meta.com>; sruffell@meta.com <sruffell@meta.com>; simona@ffwll.ch <simona@ffwll.ch>; mripard@kernel.org <mripard@kernel.org>; tzimmermann@suse.de <tzimmermann@suse.de> Subject: Re: [PATCH 2/3] drm/amdgpu: Adding amdgpu CRIU ioctl On 6/13/25 20:23, David Francis wrote: > amdgpu CRIU requires an amdgpu CRIU ioctl. This ioctl > has a similar interface to the amdkfd CRIU ioctl. > > The objects that can be checkpointed and restored are bos and vm > mappings. Because a single amdgpu bo can have multiple mappings. > the mappings are recorded separately. > > The ioctl has two modes: PROCESS_INFO, which sends to the user > how many bos and vms to expect, and CHECKPOINT, which copies > data about bos and vms into user-provided buffers. > > Restore is handled using existing amdgpu and drm ioctls. > > The new ioctl lives in a new file amdgpu_criu.c with its own > header amdgpu_criu.h > > Signed-off-by: David Francis <David.Francis@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c | 234 +++++++++++++++++++++++ > drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h | 34 ++++ > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 + > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 11 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 10 + > drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 + > include/uapi/drm/amdgpu_drm.h | 62 ++++++ > 8 files changed, 357 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c > create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h > > diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile > index 87080c06e5fc..0863edcdd03f 100644 > --- a/drivers/gpu/drm/amd/amdgpu/Makefile > +++ b/drivers/gpu/drm/amd/amdgpu/Makefile > @@ -63,7 +63,7 @@ amdgpu-y += amdgpu_device.o amdgpu_doorbell_mgr.o amdgpu_kms.o \ > amdgpu_xgmi.o amdgpu_csa.o amdgpu_ras.o amdgpu_vm_cpu.o \ > amdgpu_vm_sdma.o amdgpu_discovery.o amdgpu_ras_eeprom.o amdgpu_nbio.o \ > amdgpu_umc.o smu_v11_0_i2c.o amdgpu_fru_eeprom.o amdgpu_rap.o \ > - amdgpu_fw_attestation.o amdgpu_securedisplay.o \ > + amdgpu_fw_attestation.o amdgpu_securedisplay.o amdgpu_criu.o \ > amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \ > amdgpu_ring_mux.o amdgpu_xcp.o amdgpu_seq64.o amdgpu_aca.o amdgpu_dev_coredump.o \ > amdgpu_cper.o amdgpu_userq_fence.o amdgpu_eviction_fence.o > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c > new file mode 100644 > index 000000000000..8141ab09698c > --- /dev/null > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.c > @@ -0,0 +1,234 @@ > +/* SPDX-License-Identifier: MIT */ > +/* > +* Copyright 2025 Advanced Micro Devices, Inc. > +* > +* Permission is hereby granted, free of charge, to any person obtaining a > +* copy of this software and associated documentation files (the "Software"), > +* to deal in the Software without restriction, including without limitation > +* the rights to use, copy, modify, merge, publish, distribute, sublicense, > +* and/or sell copies of the Software, and to permit persons to whom the > +* Software is furnished to do so, subject to the following conditions: > +* > +* The above copyright notice and this permission notice shall be included in > +* all copies or substantial portions of the Software. > +* > +* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > +* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > +* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > +* THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > +* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > +* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > +* OTHER DEALINGS IN THE SOFTWARE. > +*/ > + > +#include <linux/dma-buf.h> > +#include <linux/hashtable.h> > +#include <linux/mutex.h> > +#include <linux/random.h> > + > +#include <drm/amdgpu_drm.h> > +#include <drm/drm_device.h> > +#include <drm/drm_file.h> > + > +#include "amdgpu_criu.h" > + > +#include <drm/amdgpu_drm.h> > +#include <drm/drm_drv.h> > +#include <drm/drm_exec.h> > +#include <drm/drm_gem_ttm_helper.h> > +#include <drm/ttm/ttm_tt.h> > +#include <linux/interval_tree_generic.h> > + > +#include "amdgpu.h" > +#include "amdgpu_display.h" > +#include "amdgpu_dma_buf.h" > +#include "amdgpu_hmm.h" > +#include "amdgpu_xgmi.h" > + > +static uint32_t hardware_flags_to_uapi_flags(struct amdgpu_device *adev, uint64_t pte_flags) > +{ > + uint32_t gem_flags = 0; > + > + //This function will be replaced by the mapping flags rework > + > + if (pte_flags & AMDGPU_PTE_EXECUTABLE) > + gem_flags |= AMDGPU_VM_PAGE_EXECUTABLE; > + if (pte_flags & AMDGPU_PTE_READABLE) > + gem_flags |= AMDGPU_VM_PAGE_READABLE; > + if (pte_flags & AMDGPU_PTE_WRITEABLE) > + gem_flags |= AMDGPU_VM_PAGE_WRITEABLE; > + if (pte_flags & AMDGPU_PTE_PRT_FLAG(adev)) > + gem_flags |= AMDGPU_VM_PAGE_PRT; > + if (pte_flags & AMDGPU_PTE_NOALLOC) > + gem_flags |= AMDGPU_VM_PAGE_NOALLOC; > + > + return gem_flags; > +} > + > + > +/** > + * amdgpu_criu_bo_info_ioctl - get information about a process' buffer objects > + * > + * @dev: drm device pointer > + * @data: drm_amdgpu_criu_bo_info_args > + * @filp: drm file pointer > + * > + * num_bos is set as an input to the size of the bo_buckets array. > + * num_bos is sent back as output as the number of bos in the process. > + * If that number is larger than the size of the array, the ioctl must > + * be retried. > + * > + * Returns: > + * 0 for success, -errno for errors. > + */ > +int amdgpu_criu_bo_info_ioctl(struct drm_device *dev, void *data, > + struct drm_file *filp) > +{ > + struct drm_amdgpu_criu_bo_info_args *args = data; > + struct drm_amdgpu_criu_bo_bucket *bo_buckets; > + struct drm_gem_object *gobj; > + int id, ret = 0; > + int bo_index = 0; > + int num_bos = 0; > + > + spin_lock(&filp->table_lock); > + idr_for_each_entry(&filp->object_idr, gobj, id) > + num_bos += 1; > + spin_unlock(&filp->table_lock); > + > + if (args->num_bos < num_bos) { > + args->num_bos = num_bos; > + goto exit; > + } > + args->num_bos = num_bos; > + if (num_bos == 0) { > + goto exit; > + } > + > + bo_buckets = kvzalloc(num_bos * sizeof(*bo_buckets), GFP_KERNEL); > + if (!bo_buckets) { > + ret = -ENOMEM; > + goto free_buckets; > + } > + > + spin_lock(&filp->table_lock); > + idr_for_each_entry(&filp->object_idr, gobj, id) { > + struct amdgpu_bo *bo = gem_to_amdgpu_bo(gobj); > + struct drm_amdgpu_criu_bo_bucket *bo_bucket; > + > + bo_bucket = &bo_buckets[bo_index]; > + > + bo_bucket->size = amdgpu_bo_size(bo); > + bo_bucket->offset = amdgpu_bo_mmap_offset(bo); Not all BOs have a MMAP offset. Additional to that we already have an IOCTL to query the mmap offset which does the appropriate checks and doesn't potentially crash when there is none. > + bo_bucket->alloc_flags = bo->flags & (~AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE); > + bo_bucket->preferred_domains = bo->preferred_domains; > + bo_bucket->gem_handle = id; > + > + if (bo->tbo.base.import_attach) > + bo_bucket->flags |= AMDGPU_CRIU_BO_FLAG_IS_IMPORT; > + > + bo_index += 1; > + } > + spin_unlock(&filp->table_lock); > + > + ret = copy_to_user((void __user *)args->bo_buckets, bo_buckets, num_bos * sizeof(*bo_buckets)); > + if (ret) { > + pr_debug("Failed to copy BO information to user\n"); > + ret = -EFAULT; > + } > + > +free_buckets: > + kvfree(bo_buckets); > +exit: > + > + return ret; > +} > + > +/** > + * amdgpu_criu_bo_info_ioctl - get information about a process' buffer objects > + * > + * @dev: drm device pointer > + * @data: drm_amdgpu_criu_mapping_info_args > + * @filp: drm file pointer > + * > + * num_mappings is set as an input to the size of the vm_buckets array. > + * num_mappings is sent back as output as the number of mappings the bo has. > + * If that number is larger than the size of the array, the ioctl must > + * be retried. > + * > + * Returns: > + * 0 for success, -errno for errors. > + */ > +int amdgpu_criu_mapping_info_ioctl(struct drm_device *dev, void *data, > + struct drm_file *filp) Please split adding that into a separate patch. > +{ > + struct drm_amdgpu_criu_mapping_info_args *args = data; > + struct drm_gem_object *gobj = idr_find(&filp->object_idr, args->gem_handle); > + struct amdgpu_vm *avm = &((struct amdgpu_fpriv *)filp->driver_priv)->vm; > + struct amdgpu_bo *bo = gem_to_amdgpu_bo(gobj); > + struct amdgpu_bo_va *bo_va = amdgpu_vm_bo_find(avm, bo); > + struct amdgpu_fpriv *fpriv = filp->driver_priv; > + struct drm_amdgpu_criu_vm_bucket *vm_buckets; > + struct amdgpu_bo_va_mapping *mapping; > + struct drm_exec exec; > + int vm_index = 0; That needs a better name. > + int ret = 0; Don't initialize return variables if you don't need it. That is very bad practice. > + > + vm_buckets = kvzalloc(args->num_mappings * sizeof(*vm_buckets), GFP_KERNEL); > + if (!vm_buckets) { > + ret = -ENOMEM; > + goto free_vms; > + } > + > + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT | > + DRM_EXEC_IGNORE_DUPLICATES, 0); > + drm_exec_until_all_locked(&exec) { > + if (gobj) { > + ret = drm_exec_lock_obj(&exec, gobj); > + drm_exec_retry_on_contention(&exec); > + if (ret) > + goto unlock_exec; > + } > + > + ret = amdgpu_vm_lock_pd(&fpriv->vm, &exec, 2); > + drm_exec_retry_on_contention(&exec); > + if (ret) > + goto unlock_exec; > + } > + > + amdgpu_vm_it_for_each_entry(avm, mapping, 0, U64_MAX) { > + if (mapping->bo_va == bo_va) { > + if (vm_index < args->num_mappings) { > + vm_buckets[vm_index].start = mapping->start; > + vm_buckets[vm_index].last = mapping->last; > + vm_buckets[vm_index].offset = mapping->offset; > + vm_buckets[vm_index].flags = hardware_flags_to_uapi_flags(drm_to_adev(dev), mapping->flags); > + } > + vm_index += 1; > + } > + } This chunk should go into amdgpu_vm.c And I strongly suggest to not go over the mapping rb tree but rather the list in the bo_va. Regards, Christian. > + > + drm_exec_fini(&exec); > + > + if (vm_index > 0) { > + if (vm_index <= args->num_mappings) { > + ret = copy_to_user((void __user *)args->vm_buckets, vm_buckets, vm_index * sizeof(*vm_buckets)); > + if (ret) { > + pr_debug("Failed to copy BO information to user\n"); > + ret = -EFAULT; > + } > + } > + } > + args->num_mappings = vm_index; > + > + > + kvfree(vm_buckets); > + > + return ret; > +unlock_exec: > + drm_exec_fini(&exec); > +free_vms: > + kvfree(vm_buckets); > + > + return ret; > +} > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h > new file mode 100644 > index 000000000000..9c196973ed0f > --- /dev/null > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_criu.h > @@ -0,0 +1,34 @@ > +/* SPDX-License-Identifier: MIT */ > +/* > +* Copyright 2025 Advanced Micro Devices, Inc. > +* > +* Permission is hereby granted, free of charge, to any person obtaining a > +* copy of this software and associated documentation files (the "Software"), > +* to deal in the Software without restriction, including without limitation > +* the rights to use, copy, modify, merge, publish, distribute, sublicense, > +* and/or sell copies of the Software, and to permit persons to whom the > +* Software is furnished to do so, subject to the following conditions: > +* > +* The above copyright notice and this permission notice shall be included in > +* all copies or substantial portions of the Software. > +* > +* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > +* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > +* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > +* THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > +* OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > +* ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > +* OTHER DEALINGS IN THE SOFTWARE. > +*/ > + > +#ifndef __AMDGPU_CRIU_H__ > +#define __AMDGPU_CRIU_H__ > + > +#include <drm/amdgpu_drm.h> > + > +int amdgpu_criu_bo_info_ioctl(struct drm_device *dev, void *data, > + struct drm_file *filp); > +int amdgpu_criu_mapping_info_ioctl(struct drm_device *dev, void *data, > + struct drm_file *filp); > + > +#endif > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > index 4db92e0a60da..5f3de93a665d 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > @@ -53,6 +53,7 @@ > #include "amdgpu_xgmi.h" > #include "amdgpu_userq.h" > #include "amdgpu_userq_fence.h" > +#include "amdgpu_criu.h" > #include "../amdxcp/amdgpu_xcp_drv.h" > > /* > @@ -3021,6 +3022,8 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = { > DRM_IOCTL_DEF_DRV(AMDGPU_USERQ, amdgpu_userq_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), > DRM_IOCTL_DEF_DRV(AMDGPU_USERQ_SIGNAL, amdgpu_userq_signal_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), > DRM_IOCTL_DEF_DRV(AMDGPU_USERQ_WAIT, amdgpu_userq_wait_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), > + DRM_IOCTL_DEF_DRV(AMDGPU_CRIU_BO_INFO, amdgpu_criu_bo_info_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), > + DRM_IOCTL_DEF_DRV(AMDGPU_CRIU_MAPPING_INFO, amdgpu_criu_mapping_info_ioctl, DRM_AUTH|DRM_RENDER_ALLOW), > }; > > static const struct drm_driver amdgpu_kms_driver = { > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > index 3911c78f8282..f803908cf46d 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > @@ -3156,3 +3156,14 @@ bool amdgpu_vm_is_bo_always_valid(struct amdgpu_vm *vm, struct amdgpu_bo *bo) > { > return bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv; > } > + > + > +struct amdgpu_bo_va_mapping *amdgpu_vm_it_first_mapping_in_range(struct amdgpu_vm *avm, uint64_t start, uint64_t end) > +{ > + return amdgpu_vm_it_iter_first(&avm->va, start, end); > +} > + > +struct amdgpu_bo_va_mapping *amdgpu_vm_it_next_mapping_in_range(struct amdgpu_bo_va_mapping *mapping, uint64_t start, uint64_t end) > +{ > + return amdgpu_vm_it_iter_next(mapping, start, end); > +} > \ No newline at end of file > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > index f3ad687125ad..cd7d3940cc7a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h > @@ -668,4 +668,14 @@ void amdgpu_vm_tlb_fence_create(struct amdgpu_device *adev, > struct amdgpu_vm *vm, > struct dma_fence **fence); > > +struct amdgpu_bo_va_mapping *amdgpu_vm_it_first_mapping_in_range( > + struct amdgpu_vm *avm, uint64_t start, uint64_t end); > +struct amdgpu_bo_va_mapping *amdgpu_vm_it_next_mapping_in_range( > + struct amdgpu_bo_va_mapping *mapping, uint64_t start, uint64_t end); > + > +#define amdgpu_vm_it_for_each_entry(avm, mapping, start, end) \ > + for (mapping = amdgpu_vm_it_first_mapping_in_range(avm, start, end); \ > + mapping; \ > + mapping = amdgpu_vm_it_next_mapping_in_range(mapping, start, end)) > + > #endif > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > index a2149afa5803..a8cf2d4580cc 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c > @@ -45,6 +45,8 @@ > #include "amdgpu_dma_buf.h" > #include "kfd_debug.h" > > +#include "amdgpu_criu.h" > + > static long kfd_ioctl(struct file *, unsigned int, unsigned long); > static int kfd_open(struct inode *, struct file *); > static int kfd_release(struct inode *, struct file *); > diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h > index 45c4fa13499c..16aee825e116 100644 > --- a/include/uapi/drm/amdgpu_drm.h > +++ b/include/uapi/drm/amdgpu_drm.h > @@ -57,6 +57,10 @@ extern "C" { > #define DRM_AMDGPU_USERQ 0x16 > #define DRM_AMDGPU_USERQ_SIGNAL 0x17 > #define DRM_AMDGPU_USERQ_WAIT 0x18 > +#define DRM_AMDGPU_CRIU_OP 0x19 > + > +#define DRM_AMDGPU_CRIU_BO_INFO 0x20 > +#define DRM_AMDGPU_CRIU_MAPPING_INFO 0x21 > > #define DRM_IOCTL_AMDGPU_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create) > #define DRM_IOCTL_AMDGPU_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap) > @@ -77,6 +81,10 @@ extern "C" { > #define DRM_IOCTL_AMDGPU_USERQ DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_USERQ, union drm_amdgpu_userq) > #define DRM_IOCTL_AMDGPU_USERQ_SIGNAL DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_USERQ_SIGNAL, struct drm_amdgpu_userq_signal) > #define DRM_IOCTL_AMDGPU_USERQ_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_USERQ_WAIT, struct drm_amdgpu_userq_wait) > +#define DRM_IOCTL_AMDGPU_CRIU_OP DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_CRIU_OP, struct drm_amdgpu_criu_args) > + > +#define DRM_IOCTL_AMDGPU_CRIU_BO_INFO DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_CRIU_BO_INFO, struct drm_amdgpu_criu_bo_info_args) > +#define DRM_IOCTL_AMDGPU_CRIU_MAPPING_INFO DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_CRIU_MAPPING_INFO, struct drm_amdgpu_criu_mapping_info_args) > > /** > * DOC: memory domains > @@ -1626,6 +1634,60 @@ struct drm_color_ctm_3x4 { > __u64 matrix[12]; > }; > > +#define AMDGPU_CRIU_BO_FLAG_IS_IMPORT (1 << 0) > + > +struct drm_amdgpu_criu_bo_info_args { > + /* IN: Size of bo_buckets buffer. OUT: Number of bos in process (if larger than size of buffer, must retry) */ > + __u32 num_bos; > + > + /* User pointer to array of drm_amdgpu_criu_bo_bucket */ > + __u64 bo_buckets; > +}; > + > +struct drm_amdgpu_criu_bo_bucket { > + /* Size of bo */ > + __u64 size; > + > + /* Offset of bo in device file */ > + __u64 offset; > + > + /* GEM_CREATE flags for re-creation of buffer */ > + __u64 alloc_flags; > + > + /* Pending how to handle this; provides information needed to remake the buffer on restore */ > + __u32 preferred_domains; > + > + /* Currently just one flag: IS_IMPORT */ > + __u32 flags; > + > + __u32 gem_handle; > +}; > + > +struct drm_amdgpu_criu_mapping_info_args { > + /* Handle of bo to get mappings of */ > + __u32 gem_handle; > + > + /* IN: Size of vm_buckets buffer. OUT: Number of bos in process (if larger than size of buffer, must retry) */ > + __u32 num_mappings; > + > + /* User pointer to array of drm_amdgpu_criu_vm_bucket */ > + __u64 vm_buckets; > +}; > + > +struct drm_amdgpu_criu_vm_bucket { > + /* Start of mapping (in number of pages) */ > + __u64 start; > + > + /* End of mapping (in number of pages) */ > + __u64 last; > + > + /* Mapping offset */ > + __u64 offset; > + > + /* flags needed to recreate mapping; still pending how to get these */ > + __u64 flags; > +}; > + > #if defined(__cplusplus) > } > #endif [-- Attachment #2: Type: text/html, Size: 38757 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 3/3] drm/amdgpu: Allow kfd CRIU with no buffer objects 2025-06-13 18:23 [PATCH v6] Add CRIU support for amdgpu dmabuf David Francis 2025-06-13 18:23 ` [PATCH 1/3] drm: Add DRM prime interfaces to reassign GEM handle David Francis 2025-06-13 18:23 ` [PATCH 2/3] drm/amdgpu: Adding amdgpu CRIU ioctl David Francis @ 2025-06-13 18:23 ` David Francis 2 siblings, 0 replies; 10+ messages in thread From: David Francis @ 2025-06-13 18:23 UTC (permalink / raw) To: dri-devel Cc: tvrtko.ursulin, Felix.Kuehling, David.YatSin, Chris.Freehill, Christian.Koenig, dcostantino, sruffell, simona, mripard, tzimmermann, David Francis The kfd CRIU checkpoint ioctl would return an error if trying to checkpoint a process with no kfd buffer objects. This is a normal case and should not be an error. Signed-off-by: David Francis <David.Francis@amd.com> --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index a8cf2d4580cc..9617c696d5b6 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -2569,8 +2569,8 @@ static int criu_restore(struct file *filep, pr_debug("CRIU restore (num_devices:%u num_bos:%u num_objects:%u priv_data_size:%llu)\n", args->num_devices, args->num_bos, args->num_objects, args->priv_data_size); - if (!args->bos || !args->devices || !args->priv_data || !args->priv_data_size || - !args->num_devices || !args->num_bos) + if ((args->num_bos > 0 && !args->bos) || !args->devices || !args->priv_data || + !args->priv_data_size || !args->num_devices) return -EINVAL; mutex_lock(&p->mutex); -- 2.34.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v5] Add CRIU support for amdgpu dmabuf @ 2025-05-21 14:06 David Francis 2025-05-21 14:06 ` [PATCH 1/3] drm: Add DRM prime interfaces to reassign GEM handle David Francis 0 siblings, 1 reply; 10+ messages in thread From: David Francis @ 2025-05-21 14:06 UTC (permalink / raw) To: dri-devel Cc: tvrtko.ursulin, Felix.Kuehling, David.YatSin, Chris.Freehill, Christian.Koenig, dcostantino, sruffell, simona, mripard, tzimmermann (previous patches were incorrectly called v3 but were actually the 4th version) This patch series adds support for CRIU checkpointing of processes that share memory with the amdgpu dmabuf interface. In this v5, the drm interfaces have been changed from creating buffer objects with specified gem handles to changing the gem handle of an existing buffer object. This new ioctl (AMDGPU_GEM_CHANGE_HANDLE) is in drm_gem.c In the accompanying CRIU patch set, the sockets used by CRIU to exchange dmabuf fds between restoring processes have been refactored to be created within the amdgpu plugin itself. ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/3] drm: Add DRM prime interfaces to reassign GEM handle 2025-05-21 14:06 [PATCH v5] Add CRIU support for amdgpu dmabuf David Francis @ 2025-05-21 14:06 ` David Francis 2025-05-21 14:42 ` Christian König 0 siblings, 1 reply; 10+ messages in thread From: David Francis @ 2025-05-21 14:06 UTC (permalink / raw) To: dri-devel Cc: tvrtko.ursulin, Felix.Kuehling, David.YatSin, Chris.Freehill, Christian.Koenig, dcostantino, sruffell, simona, mripard, tzimmermann, David Francis CRIU restore of drm buffer objects requires the ability to create or import a buffer object with a specific gem handle. Add new drm ioctl DRM_IOCTL_GEM_CHANGE_HANDLE, which takes the gem handle of an object and moves that object to a specified new gem handle. This ioctl needs to call drm_prime_remove_buf_handle, but that function acquires the prime lock, which the ioctl needs to hold for other purposes. Make drm_prime_remove_buf_handle not acquire the prime lock, and change its other caller to reflect this. Signed-off-by: David Francis <David.Francis@amd.com> --- drivers/gpu/drm/drm_gem.c | 52 ++++++++++++++++++++++++++++++++++ drivers/gpu/drm/drm_internal.h | 4 +++ drivers/gpu/drm/drm_ioctl.c | 1 + drivers/gpu/drm/drm_prime.c | 6 +--- include/uapi/drm/drm.h | 17 +++++++++++ 5 files changed, 75 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index c6240bab3fa5..d388bbb7a9de 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -282,7 +282,12 @@ drm_gem_object_release_handle(int id, void *ptr, void *data) if (obj->funcs->close) obj->funcs->close(obj, file_priv); + mutex_lock(&file_priv->prime.lock); + drm_prime_remove_buf_handle(&file_priv->prime, id); + + mutex_unlock(&file_priv->prime.lock); + drm_vma_node_revoke(&obj->vma_node, file_priv); drm_gem_object_handle_put_unlocked(obj); @@ -888,6 +893,53 @@ drm_gem_flink_ioctl(struct drm_device *dev, void *data, return ret; } +/** + * drm_gem_open_ioctl - implementation of the GEM_CHANGE_HANDLE ioctl + * @dev: drm_device + * @data: ioctl data + * @file_priv: drm file-private structure + * + * find the object at the specified gem handle. Remove it from that handle, and assign it + * the specified new handle. + */ +int drm_gem_change_handle_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct drm_gem_change_handle *args = data; + struct drm_gem_object *obj; + int ret; + + obj = drm_gem_object_lookup(file_priv, args->handle); + if (!obj) + return -ENOENT; + + if (args->handle == args->new_handle) + return 0; + + get_dma_buf(obj->dma_buf); + mutex_lock(&file_priv->prime.lock); + spin_lock(&file_priv->table_lock); + + ret = idr_alloc(&file_priv->object_idr, obj, args->new_handle, args->new_handle + 1, GFP_NOWAIT); + if (ret < 0) + goto out_unlock; + + ret = drm_prime_add_buf_handle(&file_priv->prime, obj->dma_buf, args->new_handle); + if (ret < 0) + goto out_unlock; + + drm_prime_remove_buf_handle(&file_priv->prime, args->handle); + + idr_remove(&file_priv->object_idr, args->handle); + +out_unlock: + spin_unlock(&file_priv->table_lock); + mutex_unlock(&file_priv->prime.lock); + dma_buf_put(obj->dma_buf); + + return ret; +} + /** * drm_gem_open_ioctl - implementation of the GEM_OPEN ioctl * @dev: drm_device diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h index b2b6a8e49dda..e9d5cdf7e033 100644 --- a/drivers/gpu/drm/drm_internal.h +++ b/drivers/gpu/drm/drm_internal.h @@ -85,6 +85,8 @@ int drm_prime_fd_to_handle_ioctl(struct drm_device *dev, void *data, void drm_prime_init_file_private(struct drm_prime_file_private *prime_fpriv); void drm_prime_destroy_file_private(struct drm_prime_file_private *prime_fpriv); +int drm_prime_add_buf_handle(struct drm_prime_file_private *prime_fpriv, + struct dma_buf *dma_buf, uint32_t handle); void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv, uint32_t handle); @@ -168,6 +170,8 @@ int drm_gem_close_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int drm_gem_flink_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +int drm_gem_change_handle_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv); int drm_gem_open_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); void drm_gem_open(struct drm_device *dev, struct drm_file *file_private); diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index f593dc569d31..d8a24875a7ba 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -653,6 +653,7 @@ static const struct drm_ioctl_desc drm_ioctls[] = { DRM_IOCTL_DEF(DRM_IOCTL_GEM_CLOSE, drm_gem_close_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_GEM_FLINK, drm_gem_flink_ioctl, DRM_AUTH), DRM_IOCTL_DEF(DRM_IOCTL_GEM_OPEN, drm_gem_open_ioctl, DRM_AUTH), + DRM_IOCTL_DEF(DRM_IOCTL_GEM_CHANGE_HANDLE, drm_gem_change_handle_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETRESOURCES, drm_mode_getresources, 0), diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index bdb51c8f262e..1f2e858e5000 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -93,7 +93,7 @@ struct drm_prime_member { struct rb_node handle_rb; }; -static int drm_prime_add_buf_handle(struct drm_prime_file_private *prime_fpriv, +int drm_prime_add_buf_handle(struct drm_prime_file_private *prime_fpriv, struct dma_buf *dma_buf, uint32_t handle) { struct drm_prime_member *member; @@ -190,8 +190,6 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv, { struct rb_node *rb; - mutex_lock(&prime_fpriv->lock); - rb = prime_fpriv->handles.rb_node; while (rb) { struct drm_prime_member *member; @@ -210,8 +208,6 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv, rb = rb->rb_left; } } - - mutex_unlock(&prime_fpriv->lock); } void drm_prime_init_file_private(struct drm_prime_file_private *prime_fpriv) diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h index 7fba37b94401..84c819c171d2 100644 --- a/include/uapi/drm/drm.h +++ b/include/uapi/drm/drm.h @@ -625,6 +625,15 @@ struct drm_gem_open { __u64 size; }; +/* DRM_IOCTL_GEM_CHANGE_HANDLE ioctl argument type */ +struct drm_gem_change_handle { + /** Current handle of object */ + __u32 handle; + + /** Handle to change that object to */ + __u32 new_handle; +}; + /** * DRM_CAP_DUMB_BUFFER * @@ -1305,6 +1314,14 @@ extern "C" { */ #define DRM_IOCTL_SET_CLIENT_NAME DRM_IOWR(0xD1, struct drm_set_client_name) +/** + * DRM_IOCTL_GEM_CHANGE_HANDLE - Move an object to a different handle + * + * Some applications (notably CRIU) need objects to have specific gem handles. + * This ioctl changes the object at one gem handle to use a new gem handle. + */ +#define DRM_IOCTL_GEM_CHANGE_HANDLE DRM_IOWR(0xD2, struct drm_gem_change_handle) + /* * Device specific ioctls should only be in their respective headers * The device specific ioctl range is from 0x40 to 0x9f. -- 2.34.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] drm: Add DRM prime interfaces to reassign GEM handle 2025-05-21 14:06 ` [PATCH 1/3] drm: Add DRM prime interfaces to reassign GEM handle David Francis @ 2025-05-21 14:42 ` Christian König 0 siblings, 0 replies; 10+ messages in thread From: Christian König @ 2025-05-21 14:42 UTC (permalink / raw) To: David Francis, dri-devel Cc: tvrtko.ursulin, Felix.Kuehling, David.YatSin, Chris.Freehill, dcostantino, sruffell, simona, mripard, tzimmermann On 5/21/25 16:06, David Francis wrote: > CRIU restore of drm buffer objects requires the ability to create > or import a buffer object with a specific gem handle. > > Add new drm ioctl DRM_IOCTL_GEM_CHANGE_HANDLE, which takes > the gem handle of an object and moves that object to a > specified new gem handle. > > This ioctl needs to call drm_prime_remove_buf_handle, > but that function acquires the prime lock, which the ioctl > needs to hold for other purposes. > > Make drm_prime_remove_buf_handle not acquire the prime lock, > and change its other caller to reflect this. > > Signed-off-by: David Francis <David.Francis@amd.com> > --- > drivers/gpu/drm/drm_gem.c | 52 ++++++++++++++++++++++++++++++++++ > drivers/gpu/drm/drm_internal.h | 4 +++ > drivers/gpu/drm/drm_ioctl.c | 1 + > drivers/gpu/drm/drm_prime.c | 6 +--- > include/uapi/drm/drm.h | 17 +++++++++++ > 5 files changed, 75 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c > index c6240bab3fa5..d388bbb7a9de 100644 > --- a/drivers/gpu/drm/drm_gem.c > +++ b/drivers/gpu/drm/drm_gem.c > @@ -282,7 +282,12 @@ drm_gem_object_release_handle(int id, void *ptr, void *data) > if (obj->funcs->close) > obj->funcs->close(obj, file_priv); > > + mutex_lock(&file_priv->prime.lock); > + > drm_prime_remove_buf_handle(&file_priv->prime, id); > + > + mutex_unlock(&file_priv->prime.lock); > + > drm_vma_node_revoke(&obj->vma_node, file_priv); > > drm_gem_object_handle_put_unlocked(obj); > @@ -888,6 +893,53 @@ drm_gem_flink_ioctl(struct drm_device *dev, void *data, > return ret; > } > > +/** > + * drm_gem_open_ioctl - implementation of the GEM_CHANGE_HANDLE ioctl > + * @dev: drm_device > + * @data: ioctl data > + * @file_priv: drm file-private structure > + * > + * find the object at the specified gem handle. Remove it from that handle, and assign it > + * the specified new handle. > + */ > +int drm_gem_change_handle_ioctl(struct drm_device *dev, void *data, > + struct drm_file *file_priv) > +{ > + struct drm_gem_change_handle *args = data; > + struct drm_gem_object *obj; > + int ret; > + > + obj = drm_gem_object_lookup(file_priv, args->handle); > + if (!obj) > + return -ENOENT; > + > + if (args->handle == args->new_handle) > + return 0; > + > + get_dma_buf(obj->dma_buf); That is unecessary now that the new handle is made valid before the old one is removed. > + mutex_lock(&file_priv->prime.lock); > + spin_lock(&file_priv->table_lock); > + > + ret = idr_alloc(&file_priv->object_idr, obj, args->new_handle, args->new_handle + 1, GFP_NOWAIT); > + if (ret < 0) > + goto out_unlock; > + > + ret = drm_prime_add_buf_handle(&file_priv->prime, obj->dma_buf, args->new_handle); That function allocates memory and so can't easily be called in atomic context. In other words you need to drop the file_priv->table_lock spinlock before calling it. > + if (ret < 0) > + goto out_unlock; > + > + drm_prime_remove_buf_handle(&file_priv->prime, args->handle); > + > + idr_remove(&file_priv->object_idr, args->handle); Then re-acquire the spinlock before calling this here. Regards, Christian. > + > +out_unlock: > + spin_unlock(&file_priv->table_lock); > + mutex_unlock(&file_priv->prime.lock); > + dma_buf_put(obj->dma_buf); > + > + return ret; > +} > + > /** > * drm_gem_open_ioctl - implementation of the GEM_OPEN ioctl > * @dev: drm_device > diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h > index b2b6a8e49dda..e9d5cdf7e033 100644 > --- a/drivers/gpu/drm/drm_internal.h > +++ b/drivers/gpu/drm/drm_internal.h > @@ -85,6 +85,8 @@ int drm_prime_fd_to_handle_ioctl(struct drm_device *dev, void *data, > > void drm_prime_init_file_private(struct drm_prime_file_private *prime_fpriv); > void drm_prime_destroy_file_private(struct drm_prime_file_private *prime_fpriv); > +int drm_prime_add_buf_handle(struct drm_prime_file_private *prime_fpriv, > + struct dma_buf *dma_buf, uint32_t handle); > void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv, > uint32_t handle); > > @@ -168,6 +170,8 @@ int drm_gem_close_ioctl(struct drm_device *dev, void *data, > struct drm_file *file_priv); > int drm_gem_flink_ioctl(struct drm_device *dev, void *data, > struct drm_file *file_priv); > +int drm_gem_change_handle_ioctl(struct drm_device *dev, void *data, > + struct drm_file *file_priv); > int drm_gem_open_ioctl(struct drm_device *dev, void *data, > struct drm_file *file_priv); > void drm_gem_open(struct drm_device *dev, struct drm_file *file_private); > diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c > index f593dc569d31..d8a24875a7ba 100644 > --- a/drivers/gpu/drm/drm_ioctl.c > +++ b/drivers/gpu/drm/drm_ioctl.c > @@ -653,6 +653,7 @@ static const struct drm_ioctl_desc drm_ioctls[] = { > DRM_IOCTL_DEF(DRM_IOCTL_GEM_CLOSE, drm_gem_close_ioctl, DRM_RENDER_ALLOW), > DRM_IOCTL_DEF(DRM_IOCTL_GEM_FLINK, drm_gem_flink_ioctl, DRM_AUTH), > DRM_IOCTL_DEF(DRM_IOCTL_GEM_OPEN, drm_gem_open_ioctl, DRM_AUTH), > + DRM_IOCTL_DEF(DRM_IOCTL_GEM_CHANGE_HANDLE, drm_gem_change_handle_ioctl, DRM_RENDER_ALLOW), > > DRM_IOCTL_DEF(DRM_IOCTL_MODE_GETRESOURCES, drm_mode_getresources, 0), > > diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c > index bdb51c8f262e..1f2e858e5000 100644 > --- a/drivers/gpu/drm/drm_prime.c > +++ b/drivers/gpu/drm/drm_prime.c > @@ -93,7 +93,7 @@ struct drm_prime_member { > struct rb_node handle_rb; > }; > > -static int drm_prime_add_buf_handle(struct drm_prime_file_private *prime_fpriv, > +int drm_prime_add_buf_handle(struct drm_prime_file_private *prime_fpriv, > struct dma_buf *dma_buf, uint32_t handle) > { > struct drm_prime_member *member; > @@ -190,8 +190,6 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv, > { > struct rb_node *rb; > > - mutex_lock(&prime_fpriv->lock); > - > rb = prime_fpriv->handles.rb_node; > while (rb) { > struct drm_prime_member *member; > @@ -210,8 +208,6 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv, > rb = rb->rb_left; > } > } > - > - mutex_unlock(&prime_fpriv->lock); > } > > void drm_prime_init_file_private(struct drm_prime_file_private *prime_fpriv) > diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h > index 7fba37b94401..84c819c171d2 100644 > --- a/include/uapi/drm/drm.h > +++ b/include/uapi/drm/drm.h > @@ -625,6 +625,15 @@ struct drm_gem_open { > __u64 size; > }; > > +/* DRM_IOCTL_GEM_CHANGE_HANDLE ioctl argument type */ > +struct drm_gem_change_handle { > + /** Current handle of object */ > + __u32 handle; > + > + /** Handle to change that object to */ > + __u32 new_handle; > +}; > + > /** > * DRM_CAP_DUMB_BUFFER > * > @@ -1305,6 +1314,14 @@ extern "C" { > */ > #define DRM_IOCTL_SET_CLIENT_NAME DRM_IOWR(0xD1, struct drm_set_client_name) > > +/** > + * DRM_IOCTL_GEM_CHANGE_HANDLE - Move an object to a different handle > + * > + * Some applications (notably CRIU) need objects to have specific gem handles. > + * This ioctl changes the object at one gem handle to use a new gem handle. > + */ > +#define DRM_IOCTL_GEM_CHANGE_HANDLE DRM_IOWR(0xD2, struct drm_gem_change_handle) > + > /* > * Device specific ioctls should only be in their respective headers > * The device specific ioctl range is from 0x40 to 0x9f. ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <20250516193336.3531982-1-David.Francis@amd.com>]
[parent not found: <20250516193336.3531982-2-David.Francis@amd.com>]
* Re: [PATCH 1/3] drm: Add DRM prime interfaces to reassign GEM handle [not found] ` <20250516193336.3531982-2-David.Francis@amd.com> @ 2025-05-19 12:29 ` Christian König 0 siblings, 0 replies; 10+ messages in thread From: Christian König @ 2025-05-19 12:29 UTC (permalink / raw) To: David Francis, amd-gfx, dri-devel@lists.freedesktop.org, Tvrtko Ursulin, Simona Vetter Cc: Felix.Kuehling, David.YatSin, Chris.Freehill, dcostantino, sruffell On 5/16/25 21:33, David Francis wrote: > CRIU restore of drm buffer objects requires the ability to create > or import a buffer object with a specific gem handle. > > Add new drm ioctl DRM_IOCTL_PRIME_CHANGE_GEM_HANDLE, which takes > the gem handle of an object and moves that object to a > specified new gem handle. > > This ioctl needs to call drm_prime_remove_buf_handle, > but that function acquires the prime lock, which the ioctl > needs to hold for other purposes. > > Make drm_prime_remove_buf_handle not acquire the prime lock, > and change its other caller to reflect this. > > Signed-off-by: David Francis <David.Francis@amd.com> First of all you need to add more people and the dri-devel mailing list for review. A few more comments below. > --- > drivers/gpu/drm/drm_gem.c | 5 ++++ > drivers/gpu/drm/drm_internal.h | 2 ++ > drivers/gpu/drm/drm_ioctl.c | 1 + > drivers/gpu/drm/drm_prime.c | 43 ++++++++++++++++++++++++++++++---- > include/uapi/drm/drm.h | 7 ++++++ > 5 files changed, 54 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c > index ee811764c3df..f56eeed808d2 100644 > --- a/drivers/gpu/drm/drm_gem.c > +++ b/drivers/gpu/drm/drm_gem.c > @@ -282,7 +282,12 @@ drm_gem_object_release_handle(int id, void *ptr, void *data) > if (obj->funcs->close) > obj->funcs->close(obj, file_priv); > > + mutex_lock(&file_priv->prime.lock); > + > drm_prime_remove_buf_handle(&file_priv->prime, id); > + > + mutex_unlock(&file_priv->prime.lock); > + > drm_vma_node_revoke(&obj->vma_node, file_priv); > > drm_gem_object_handle_put_unlocked(obj); > diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h > index b2b6a8e49dda..7437e5eaf8db 100644 > --- a/drivers/gpu/drm/drm_internal.h > +++ b/drivers/gpu/drm/drm_internal.h > @@ -82,6 +82,8 @@ int drm_prime_handle_to_fd_ioctl(struct drm_device *dev, void *data, > struct drm_file *file_priv); > int drm_prime_fd_to_handle_ioctl(struct drm_device *dev, void *data, > struct drm_file *file_priv); > +int drm_prime_change_gem_handle_ioctl(struct drm_device *dev, void *data, > + struct drm_file *file_priv); > > void drm_prime_init_file_private(struct drm_prime_file_private *prime_fpriv); > void drm_prime_destroy_file_private(struct drm_prime_file_private *prime_fpriv); > diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c > index f593dc569d31..f9b531311aac 100644 > --- a/drivers/gpu/drm/drm_ioctl.c > +++ b/drivers/gpu/drm/drm_ioctl.c > @@ -658,6 +658,7 @@ static const struct drm_ioctl_desc drm_ioctls[] = { > > DRM_IOCTL_DEF(DRM_IOCTL_PRIME_HANDLE_TO_FD, drm_prime_handle_to_fd_ioctl, DRM_RENDER_ALLOW), > DRM_IOCTL_DEF(DRM_IOCTL_PRIME_FD_TO_HANDLE, drm_prime_fd_to_handle_ioctl, DRM_RENDER_ALLOW), > + DRM_IOCTL_DEF(DRM_IOCTL_PRIME_CHANGE_GEM_HANDLE, drm_prime_change_gem_handle_ioctl, DRM_RENDER_ALLOW), > > DRM_IOCTL_DEF(DRM_IOCTL_SET_CLIENT_NAME, drm_set_client_name, DRM_RENDER_ALLOW), > > diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c > index 0e3f8adf162f..8f3317fcae49 100644 > --- a/drivers/gpu/drm/drm_prime.c > +++ b/drivers/gpu/drm/drm_prime.c > @@ -190,8 +190,6 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv, > { > struct rb_node *rb; > > - mutex_lock(&prime_fpriv->lock); > - > rb = prime_fpriv->handles.rb_node; > while (rb) { > struct drm_prime_member *member; > @@ -210,8 +208,6 @@ void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv, > rb = rb->rb_left; > } > } > - > - mutex_unlock(&prime_fpriv->lock); > } > > void drm_prime_init_file_private(struct drm_prime_file_private *prime_fpriv) > @@ -1084,3 +1080,42 @@ void drm_prime_gem_destroy(struct drm_gem_object *obj, struct sg_table *sg) > dma_buf_put(dma_buf); > } > EXPORT_SYMBOL(drm_prime_gem_destroy); > + > +int drm_prime_change_gem_handle_ioctl(struct drm_device *dev, void *data, > + struct drm_file *file_priv) This is not a prime function, but rather a GEM function. So the new code should go into drm_gem.c, I suggest close to the drm_gem_open_ioctl() function. > +{ > + struct drm_prime_change_gem_handle *args = data; > + struct drm_gem_object *obj; > + int ret; > + > + obj = drm_gem_object_lookup(file_priv, args->handle); > + > + if (!obj) > + return -ENOENT; > + > + get_dma_buf(obj->dma_buf); > + > + mutex_lock(&file_priv->prime.lock); > + > + drm_prime_remove_buf_handle(&file_priv->prime, args->handle); > + > + spin_lock(&file_priv->table_lock); > + > + idr_remove(&file_priv->object_idr, args->handle); > + ret = idr_alloc(&file_priv->object_idr, obj, args->new_handle, args->new_handle + 1, GFP_NOWAIT); This needs to be reordered. First call idr_alloc() to check if the new handle is actually available. Then call drm_prime_add_buf_handle() when the GEM object actually has a DMA-buf associated with it. When that worked call drm_prime_remove_buf_handle() and finally idr_remove(). Regards, Christian. > + > + spin_unlock(&file_priv->table_lock); > + > + if (ret < 0) > + goto out_unlock; > + > + ret = drm_prime_add_buf_handle(&file_priv->prime, obj->dma_buf, args->new_handle); > + if (ret < 0) > + goto out_unlock; > + > +out_unlock: > + mutex_unlock(&file_priv->prime.lock); > + dma_buf_put(obj->dma_buf); > + > + return ret; > +} > diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h > index 7fba37b94401..ae701b8f9314 100644 > --- a/include/uapi/drm/drm.h > +++ b/include/uapi/drm/drm.h > @@ -625,6 +625,11 @@ struct drm_gem_open { > __u64 size; > }; > > +struct drm_prime_change_gem_handle { > + __u32 handle; > + __u32 new_handle; > +}; > + > /** > * DRM_CAP_DUMB_BUFFER > * > @@ -1305,6 +1310,8 @@ extern "C" { > */ > #define DRM_IOCTL_SET_CLIENT_NAME DRM_IOWR(0xD1, struct drm_set_client_name) > > +#define DRM_IOCTL_PRIME_CHANGE_GEM_HANDLE DRM_IOWR(0xD2, struct drm_prime_change_gem_handle) > + > /* > * Device specific ioctls should only be in their respective headers > * The device specific ioctl range is from 0x40 to 0x9f. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-06-16 14:15 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-06-13 18:23 [PATCH v6] Add CRIU support for amdgpu dmabuf David Francis 2025-06-13 18:23 ` [PATCH 1/3] drm: Add DRM prime interfaces to reassign GEM handle David Francis 2025-06-16 8:51 ` Christian König 2025-06-13 18:23 ` [PATCH 2/3] drm/amdgpu: Adding amdgpu CRIU ioctl David Francis 2025-06-16 9:01 ` Christian König 2025-06-16 14:15 ` Francis, David 2025-06-13 18:23 ` [PATCH 3/3] drm/amdgpu: Allow kfd CRIU with no buffer objects David Francis -- strict thread matches above, loose matches on Subject: below -- 2025-05-21 14:06 [PATCH v5] Add CRIU support for amdgpu dmabuf David Francis 2025-05-21 14:06 ` [PATCH 1/3] drm: Add DRM prime interfaces to reassign GEM handle David Francis 2025-05-21 14:42 ` Christian König [not found] <20250516193336.3531982-1-David.Francis@amd.com> [not found] ` <20250516193336.3531982-2-David.Francis@amd.com> 2025-05-19 12:29 ` Christian König
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).