From: "Christian König" <christian.koenig@amd.com>
To: Felix Kuehling <felix.kuehling@amd.com>,
Intel Graphics Development <intel-gfx@lists.freedesktop.org>,
dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 3/9] drm/ttm: use per BO cleanup workers
Date: Mon, 5 Dec 2022 14:39:41 +0100 [thread overview]
Message-ID: <a7c5f157-ff42-4e87-cc79-33ba6a15a138@amd.com> (raw)
In-Reply-To: <8ff841e3-8eef-9ec2-2ba5-4907f18873c0@amd.com>
On 2022-11-29 22:14, Felix Kuehling wrote:
> On 2022-11-25 05:21, Christian König wrote:
>> Instead of a single worker going over the list of deleted BOs at regular
>> intervals, use a per-BO worker which blocks on the resv object and the
>> locking of the BO.
>>
>> This not only simplifies the handling massively, but also results in
>> much better response time when cleaning up buffers.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>
> Just thinking out loud: If I understand it correctly, this can cause a
> lot of sleeping worker threads when
> AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE is used and many BOs are freed
> at the same time. This happens e.g. when a KFD process terminates or
> crashes. I guess with a concurrency-managed workqueue this isn't going
> to be excessive. And since it's on a per device workqueue, it doesn't
> stall work items on the system work queue or from other devices.
Yes, exactly that. The last parameter to alloc_workqueue() (max_active)
limits how many work items can be executing, and thus sleeping, at the
same time.
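As an illustration only — a minimal userspace sketch with pthreads, not the TTM code itself (the names `run_demo`, `delayed_delete`, `MAX_ACTIVE` are mine, and a semaphore stands in for the workqueue's max_active limit) — the per-object-worker pattern looks roughly like this:

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <semaphore.h>
#include <stdatomic.h>

#define NUM_BOS    8   /* objects released "at the same time"           */
#define MAX_ACTIVE 4   /* analogue of alloc_workqueue()'s max_active    */

static sem_t active;        /* bounds concurrently running workers      */
static atomic_int freed;    /* how many objects were cleaned up         */

struct bo {
    pthread_t worker;       /* analogue of the per-BO work item         */
    atomic_int idle;        /* stand-in for "all fences have signaled"  */
};

/* Rough analogue of a per-BO delayed delete: each object gets its own
 * worker that blocks until the object is idle, then releases it.  Note
 * the worker occupies one of the MAX_ACTIVE slots while it waits, which
 * mirrors a sleeping work item counting against max_active. */
static void *delayed_delete(void *arg)
{
    struct bo *bo = arg;

    sem_wait(&active);
    while (!atomic_load(&bo->idle))     /* "wait for the resv to idle"  */
        sched_yield();
    atomic_fetch_add(&freed, 1);        /* "free the backing storage"   */
    sem_post(&active);
    return NULL;
}

int run_demo(void)
{
    struct bo bos[NUM_BOS] = {0};

    sem_init(&active, 0, MAX_ACTIVE);

    /* Queue one worker per object instead of polling a global list.   */
    for (int i = 0; i < NUM_BOS; i++)
        pthread_create(&bos[i].worker, NULL, delayed_delete, &bos[i]);

    for (int i = 0; i < NUM_BOS; i++)   /* the "fences" signal          */
        atomic_store(&bos[i].idle, 1);

    for (int i = 0; i < NUM_BOS; i++)   /* analogue of drain_workqueue  */
        pthread_join(bos[i].worker, NULL);

    sem_destroy(&active);
    return atomic_load(&freed);
}
```

Even with all NUM_BOS workers queued at once, at most MAX_ACTIVE of them run (or sleep) concurrently, and every object still gets cleaned up.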
> I'm trying to understand why you set WQ_MEM_RECLAIM. This work queue
> is not about freeing ttm_resources but about freeing the BOs. But it
> affects freeing of ghost_objs that are holding the ttm_resources being
> freed.
Well, if the BO is idle but not immediately lockable, we delegate freeing
the backing pages in the TT object to those workers as well. It might
even be a good idea to use a separate wq for this case.
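For reference, the allocation in the patch below combines both flags; the annotations are my reading of the workqueue semantics, not part of the patch:

```c
/*
 * From ttm_device_init() in the patch below:
 *  - WQ_MEM_RECLAIM: the workqueue gets a rescuer thread, so delayed
 *    deletes (which free memory) can make forward progress even when
 *    no new worker threads can be created under memory pressure.
 *  - WQ_HIGHPRI: work items run from the high-priority worker pool.
 *  - max_active = 16: at most 16 work items of this queue execute
 *    (or sleep waiting for fences) concurrently per CPU.
 */
bdev->wq = alloc_workqueue("ttm", WQ_MEM_RECLAIM | WQ_HIGHPRI, 16);
```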
>
> If those assumptions all make sense, patches 1-3 are
>
> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Thanks,
Christian.
>
>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>> drivers/gpu/drm/i915/i915_gem.c | 2 +-
>> drivers/gpu/drm/i915/intel_region_ttm.c | 2 +-
>> drivers/gpu/drm/ttm/ttm_bo.c | 112 ++++++++-------------
>> drivers/gpu/drm/ttm/ttm_bo_util.c | 1 -
>> drivers/gpu/drm/ttm/ttm_device.c | 24 ++---
>> include/drm/ttm/ttm_bo_api.h | 18 +---
>> include/drm/ttm/ttm_device.h | 7 +-
>> 8 files changed, 57 insertions(+), 111 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 2b1db37e25c1..74ccbd566777 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -3984,7 +3984,7 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)
>> amdgpu_fence_driver_hw_fini(adev);
>> if (adev->mman.initialized)
>> - flush_delayed_work(&adev->mman.bdev.wq);
>> + drain_workqueue(adev->mman.bdev.wq);
>> if (adev->pm_sysfs_en)
>> amdgpu_pm_sysfs_fini(adev);
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 8468ca9885fd..c38306f156d6 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -1099,7 +1099,7 @@ void i915_gem_drain_freed_objects(struct drm_i915_private *i915)
>> {
>> while (atomic_read(&i915->mm.free_count)) {
>> flush_work(&i915->mm.free_work);
>> - flush_delayed_work(&i915->bdev.wq);
>> + drain_workqueue(i915->bdev.wq);
>> rcu_barrier();
>> }
>> }
>> diff --git a/drivers/gpu/drm/i915/intel_region_ttm.c b/drivers/gpu/drm/i915/intel_region_ttm.c
>> index cf89d0c2a2d9..657bbc16a48a 100644
>> --- a/drivers/gpu/drm/i915/intel_region_ttm.c
>> +++ b/drivers/gpu/drm/i915/intel_region_ttm.c
>> @@ -132,7 +132,7 @@ int intel_region_ttm_fini(struct intel_memory_region *mem)
>> break;
>> msleep(20);
>> - flush_delayed_work(&mem->i915->bdev.wq);
>> + drain_workqueue(mem->i915->bdev.wq);
>> }
>> /* If we leaked objects, Don't free the region causing use after free */
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index b77262a623e0..4749b65bedc4 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -280,14 +280,13 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>> ret = 0;
>> }
>> - if (ret || unlikely(list_empty(&bo->ddestroy))) {
>> + if (ret) {
>> if (unlock_resv)
>> dma_resv_unlock(bo->base.resv);
>> spin_unlock(&bo->bdev->lru_lock);
>> return ret;
>> }
>> - list_del_init(&bo->ddestroy);
>> spin_unlock(&bo->bdev->lru_lock);
>> ttm_bo_cleanup_memtype_use(bo);
>> @@ -300,47 +299,21 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo,
>> }
>> /*
>> - * Traverse the delayed list, and call ttm_bo_cleanup_refs on all
>> - * encountered buffers.
>> + * Block for the dma_resv object to become idle, lock the buffer and clean up
>> + * the resource and tt object.
>> */
>> -bool ttm_bo_delayed_delete(struct ttm_device *bdev, bool remove_all)
>> +static void ttm_bo_delayed_delete(struct work_struct *work)
>> {
>> - struct list_head removed;
>> - bool empty;
>> -
>> - INIT_LIST_HEAD(&removed);
>> -
>> - spin_lock(&bdev->lru_lock);
>> - while (!list_empty(&bdev->ddestroy)) {
>> - struct ttm_buffer_object *bo;
>> -
>> - bo = list_first_entry(&bdev->ddestroy, struct ttm_buffer_object,
>> - ddestroy);
>> - list_move_tail(&bo->ddestroy, &removed);
>> - if (!ttm_bo_get_unless_zero(bo))
>> - continue;
>> -
>> - if (remove_all || bo->base.resv != &bo->base._resv) {
>> - spin_unlock(&bdev->lru_lock);
>> - dma_resv_lock(bo->base.resv, NULL);
>> -
>> - spin_lock(&bdev->lru_lock);
>> - ttm_bo_cleanup_refs(bo, false, !remove_all, true);
>> -
>> - } else if (dma_resv_trylock(bo->base.resv)) {
>> - ttm_bo_cleanup_refs(bo, false, !remove_all, true);
>> - } else {
>> - spin_unlock(&bdev->lru_lock);
>> - }
>> + struct ttm_buffer_object *bo;
>> - ttm_bo_put(bo);
>> - spin_lock(&bdev->lru_lock);
>> - }
>> - list_splice_tail(&removed, &bdev->ddestroy);
>> - empty = list_empty(&bdev->ddestroy);
>> - spin_unlock(&bdev->lru_lock);
>> + bo = container_of(work, typeof(*bo), delayed_delete);
>> - return empty;
>> + dma_resv_wait_timeout(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP, false,
>> + MAX_SCHEDULE_TIMEOUT);
>> + dma_resv_lock(bo->base.resv, NULL);
>> + ttm_bo_cleanup_memtype_use(bo);
>> + dma_resv_unlock(bo->base.resv);
>> + ttm_bo_put(bo);
>> }
>> static void ttm_bo_release(struct kref *kref)
>> @@ -369,44 +342,40 @@ static void ttm_bo_release(struct kref *kref)
>> drm_vma_offset_remove(bdev->vma_manager, &bo->base.vma_node);
>> ttm_mem_io_free(bdev, bo->resource);
>> - }
>> -
>> - if (!dma_resv_test_signaled(bo->base.resv, DMA_RESV_USAGE_BOOKKEEP) ||
>> - !dma_resv_trylock(bo->base.resv)) {
>> - /* The BO is not idle, resurrect it for delayed destroy */
>> - ttm_bo_flush_all_fences(bo);
>> - bo->deleted = true;
>> - spin_lock(&bo->bdev->lru_lock);
>> + if (!dma_resv_test_signaled(bo->base.resv,
>> + DMA_RESV_USAGE_BOOKKEEP) ||
>> + !dma_resv_trylock(bo->base.resv)) {
>> + /* The BO is not idle, resurrect it for delayed destroy */
>> + ttm_bo_flush_all_fences(bo);
>> + bo->deleted = true;
>> - /*
>> - * Make pinned bos immediately available to
>> - * shrinkers, now that they are queued for
>> - * destruction.
>> - *
>> - * FIXME: QXL is triggering this. Can be removed when the
>> - * driver is fixed.
>> - */
>> - if (bo->pin_count) {
>> - bo->pin_count = 0;
>> - ttm_resource_move_to_lru_tail(bo->resource);
>> - }
>> + spin_lock(&bo->bdev->lru_lock);
>> - kref_init(&bo->kref);
>> - list_add_tail(&bo->ddestroy, &bdev->ddestroy);
>> - spin_unlock(&bo->bdev->lru_lock);
>> + /*
>> + * Make pinned bos immediately available to
>> + * shrinkers, now that they are queued for
>> + * destruction.
>> + *
>> + * FIXME: QXL is triggering this. Can be removed when the
>> + * driver is fixed.
>> + */
>> + if (bo->pin_count) {
>> + bo->pin_count = 0;
>> + ttm_resource_move_to_lru_tail(bo->resource);
>> + }
>> - schedule_delayed_work(&bdev->wq,
>> - ((HZ / 100) < 1) ? 1 : HZ / 100);
>> - return;
>> - }
>> + kref_init(&bo->kref);
>> + spin_unlock(&bo->bdev->lru_lock);
>> - spin_lock(&bo->bdev->lru_lock);
>> - list_del(&bo->ddestroy);
>> - spin_unlock(&bo->bdev->lru_lock);
>> + INIT_WORK(&bo->delayed_delete, ttm_bo_delayed_delete);
>> + queue_work(bdev->wq, &bo->delayed_delete);
>> + return;
>> + }
>> - ttm_bo_cleanup_memtype_use(bo);
>> - dma_resv_unlock(bo->base.resv);
>> + ttm_bo_cleanup_memtype_use(bo);
>> + dma_resv_unlock(bo->base.resv);
>> + }
>> atomic_dec(&ttm_glob.bo_count);
>> bo->destroy(bo);
>> @@ -946,7 +915,6 @@ int ttm_bo_init_reserved(struct ttm_device *bdev, struct ttm_buffer_object *bo,
>> int ret;
>> kref_init(&bo->kref);
>> - INIT_LIST_HEAD(&bo->ddestroy);
>> bo->bdev = bdev;
>> bo->type = type;
>> bo->page_alignment = alignment;
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> index ba3aa0a0fc43..ae4b7922ee1a 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> @@ -230,7 +230,6 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
>> */
>> atomic_inc(&ttm_glob.bo_count);
>> - INIT_LIST_HEAD(&fbo->base.ddestroy);
>> drm_vma_node_reset(&fbo->base.base.vma_node);
>> kref_init(&fbo->base.kref);
>> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
>> index e7147e304637..e9bedca4dfdc 100644
>> --- a/drivers/gpu/drm/ttm/ttm_device.c
>> +++ b/drivers/gpu/drm/ttm/ttm_device.c
>> @@ -175,16 +175,6 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
>> }
>> EXPORT_SYMBOL(ttm_device_swapout);
>> -static void ttm_device_delayed_workqueue(struct work_struct *work)
>> -{
>> - struct ttm_device *bdev =
>> - container_of(work, struct ttm_device, wq.work);
>> -
>> - if (!ttm_bo_delayed_delete(bdev, false))
>> - schedule_delayed_work(&bdev->wq,
>> - ((HZ / 100) < 1) ? 1 : HZ / 100);
>> -}
>> -
>> /**
>> * ttm_device_init
>> *
>> @@ -215,15 +205,19 @@ int ttm_device_init(struct ttm_device *bdev, struct ttm_device_funcs *funcs,
>> if (ret)
>> return ret;
>> + bdev->wq = alloc_workqueue("ttm", WQ_MEM_RECLAIM | WQ_HIGHPRI, 16);
>> + if (!bdev->wq) {
>> + ttm_global_release();
>> + return -ENOMEM;
>> + }
>> +
>> bdev->funcs = funcs;
>> ttm_sys_man_init(bdev);
>> ttm_pool_init(&bdev->pool, dev, use_dma_alloc, use_dma32);
>> bdev->vma_manager = vma_manager;
>> - INIT_DELAYED_WORK(&bdev->wq, ttm_device_delayed_workqueue);
>> spin_lock_init(&bdev->lru_lock);
>> - INIT_LIST_HEAD(&bdev->ddestroy);
>> INIT_LIST_HEAD(&bdev->pinned);
>> bdev->dev_mapping = mapping;
>> mutex_lock(&ttm_global_mutex);
>> @@ -247,10 +241,8 @@ void ttm_device_fini(struct ttm_device *bdev)
>> list_del(&bdev->device_list);
>> mutex_unlock(&ttm_global_mutex);
>> - cancel_delayed_work_sync(&bdev->wq);
>> -
>> - if (ttm_bo_delayed_delete(bdev, true))
>> - pr_debug("Delayed destroy list was clean\n");
>> + drain_workqueue(bdev->wq);
>> + destroy_workqueue(bdev->wq);
>> spin_lock(&bdev->lru_lock);
>> for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
>> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
>> index 7758347c461c..69e62bbb01e3 100644
>> --- a/include/drm/ttm/ttm_bo_api.h
>> +++ b/include/drm/ttm/ttm_bo_api.h
>> @@ -92,7 +92,6 @@ struct ttm_tt;
>> * @ttm: TTM structure holding system pages.
>> * @evicted: Whether the object was evicted without user-space knowing.
>> * @deleted: True if the object is only a zombie and already deleted.
>> - * @ddestroy: List head for the delayed destroy list.
>> * @swap: List head for swap LRU list.
>> * @offset: The current GPU offset, which can have different meanings
>> * depending on the memory type. For SYSTEM type memory, it should be 0.
>> @@ -135,19 +134,14 @@ struct ttm_buffer_object {
>> struct ttm_tt *ttm;
>> bool deleted;
>> struct ttm_lru_bulk_move *bulk_move;
>> + unsigned priority;
>> + unsigned pin_count;
>> /**
>> - * Members protected by the bdev::lru_lock.
>> - */
>> -
>> - struct list_head ddestroy;
>> -
>> - /**
>> - * Members protected by a bo reservation.
>> + * @delayed_delete: Work item used when we can't delete the BO
>> + * immediately
>> */
>> -
>> - unsigned priority;
>> - unsigned pin_count;
>> + struct work_struct delayed_delete;
>> /**
>> * Special members that are protected by the reserve lock
>> @@ -448,8 +442,6 @@ void ttm_bo_vm_close(struct vm_area_struct *vma);
>> int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
>> void *buf, int len, int write);
>> -bool ttm_bo_delayed_delete(struct ttm_device *bdev, bool remove_all);
>> -
>> vm_fault_t ttm_bo_vm_dummy_page(struct vm_fault *vmf, pgprot_t prot);
>> #endif
>> diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h
>> index 95b3c04b1ab9..4f3e81eac6f3 100644
>> --- a/include/drm/ttm/ttm_device.h
>> +++ b/include/drm/ttm/ttm_device.h
>> @@ -251,11 +251,6 @@ struct ttm_device {
>> */
>> spinlock_t lru_lock;
>> - /**
>> - * @ddestroy: Destroyed but not yet cleaned up buffer objects.
>> - */
>> - struct list_head ddestroy;
>> -
>> /**
>> * @pinned: Buffer objects which are pinned and so not on any LRU list.
>> */
>> @@ -270,7 +265,7 @@ struct ttm_device {
>> /**
>> * @wq: Work queue structure for the delayed delete workqueue.
>> */
>> - struct delayed_work wq;
>> + struct workqueue_struct *wq;
>> };
>> int ttm_global_swapout(struct ttm_operation_ctx *ctx, gfp_t gfp_flags);