From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
To: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Intel Graphics Development <intel-gfx@lists.freedesktop.org>,
ML dri-devel <dri-devel@lists.freedesktop.org>
Subject: Re: [Intel-gfx] [PATCH 18/28] drm/i915: Take trylock during eviction, v2.
Date: Fri, 22 Oct 2021 10:44:30 +0200
Message-ID: <65b5f267-574e-3c9c-b518-c701b821232a@linux.intel.com>
In-Reply-To: <CAM0jSHNq0RrTrG3hjiBz05QEokGS8YN8=YbwQ7UgWm+S=L-0hg@mail.gmail.com>
On 21-10-2021 at 19:59, Matthew Auld wrote:
> On Thu, 21 Oct 2021 at 11:37, Maarten Lankhorst
> <maarten.lankhorst@linux.intel.com> wrote:
>> Now that freeing objects takes the object lock when destroying the
>> backing pages, we can confidently take the object lock even for dead
>> objects.
>>
>> Use this fact to take the object lock in the shrinker, without requiring
>> a reference to the object, so all calls to unbind take the object lock.
>>
>> This is the last step to requiring the object lock for vma_unbind.
> For the eviction what is the reason for only trylock here, assuming we
> are given a ww context? Maybe the back off is annoying? And the full
> lock version comes later?
2 reasons:

1. We can't take the full lock, because we already hold vm->mutex, which may itself be taken while holding dma_resv_lock. Taking a blocking dma_resv_lock here inverts the locking order; the same inversion is why we could not keep obj->mm.lock. Until the vm locking is reworked, this cannot be done safely.
Lockdep will complain about the cycle dma_resv_lock -> vm->mutex -> dma_resv_lock, which will eventually deadlock.
2. Until the locking or delayed destroy is reworked, we cannot call a blocking dma_resv_lock on objects in the list whose refcount may already be 0. See "[PATCH 25/28] drm/i915: Require object lock when freeing pages during destruction".

When destroying an object, we take dma_resv_lock in blocking mode one last time, then unbind all its vma's. Holding vm->mutex prevents the object from disappearing underneath us, because its vma is not yet unbound. This is how we get away with unbinding dead objects, both before and after these changes. It also means we can only trylock here: inside vm->mutex, a trylock is the only lock we can safely take.

If we start reworking the vm locking, we may need to handle waiting on dead objects better. It's worth noting that TTM has to handle the exact same race; see ttm_bo_cleanup_refs().
>> Changes since v1:
>> - No longer require the refcount, as every freed object now holds the lock
>> when unbinding VMA's.
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> ---
>> drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 6 ++++
>> drivers/gpu/drm/i915/i915_gem_evict.c | 34 +++++++++++++++++---
>> 2 files changed, 35 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
>> index d3f29a66cb36..34c12e5983eb 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
>> @@ -403,12 +403,18 @@ i915_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr
>> list_for_each_entry_safe(vma, next,
>> &i915->ggtt.vm.bound_list, vm_link) {
>> unsigned long count = vma->node.size >> PAGE_SHIFT;
>> + struct drm_i915_gem_object *obj = vma->obj;
>>
>> if (!vma->iomap || i915_vma_is_active(vma))
>> continue;
>>
>> + if (!i915_gem_object_trylock(obj))
>> + continue;
>> +
>> if (__i915_vma_unbind(vma) == 0)
>> freed_pages += count;
>> +
>> + i915_gem_object_unlock(obj);
>> }
>> mutex_unlock(&i915->ggtt.vm.mutex);
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
>> index 2b73ddb11c66..286efa462eca 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
>> @@ -58,6 +58,9 @@ mark_free(struct drm_mm_scan *scan,
>> if (i915_vma_is_pinned(vma))
>> return false;
>>
>> + if (!i915_gem_object_trylock(vma->obj))
>> + return false;
>> +
>> list_add(&vma->evict_link, unwind);
>> return drm_mm_scan_add_block(scan, &vma->node);
>> }
>> @@ -178,6 +181,7 @@ i915_gem_evict_something(struct i915_address_space *vm,
>> list_for_each_entry_safe(vma, next, &eviction_list, evict_link) {
>> ret = drm_mm_scan_remove_block(&scan, &vma->node);
>> BUG_ON(ret);
>> + i915_gem_object_unlock(vma->obj);
>> }
>>
>> /*
>> @@ -222,10 +226,12 @@ i915_gem_evict_something(struct i915_address_space *vm,
>> * of any of our objects, thus corrupting the list).
>> */
>> list_for_each_entry_safe(vma, next, &eviction_list, evict_link) {
>> - if (drm_mm_scan_remove_block(&scan, &vma->node))
>> + if (drm_mm_scan_remove_block(&scan, &vma->node)) {
>> __i915_vma_pin(vma);
>> - else
>> + } else {
>> list_del(&vma->evict_link);
>> + i915_gem_object_unlock(vma->obj);
>> + }
>> }
>>
>> /* Unbinding will emit any required flushes */
>> @@ -234,16 +240,22 @@ i915_gem_evict_something(struct i915_address_space *vm,
>> __i915_vma_unpin(vma);
>> if (ret == 0)
>> ret = __i915_vma_unbind(vma);
>> +
>> + i915_gem_object_unlock(vma->obj);
>> }
>>
>> while (ret == 0 && (node = drm_mm_scan_color_evict(&scan))) {
>> vma = container_of(node, struct i915_vma, node);
>>
>> +
>> /* If we find any non-objects (!vma), we cannot evict them */
>> - if (vma->node.color != I915_COLOR_UNEVICTABLE)
>> + if (vma->node.color != I915_COLOR_UNEVICTABLE &&
>> + i915_gem_object_trylock(vma->obj)) {
>> ret = __i915_vma_unbind(vma);
>> - else
>> - ret = -ENOSPC; /* XXX search failed, try again? */
>> + i915_gem_object_unlock(vma->obj);
>> + } else {
>> + ret = -ENOSPC;
>> + }
>> }
>>
>> return ret;
>> @@ -333,6 +345,11 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
>> break;
>> }
>>
>> + if (!i915_gem_object_trylock(vma->obj)) {
>> + ret = -ENOSPC;
>> + break;
>> + }
>> +
>> /*
>> * Never show fear in the face of dragons!
>> *
>> @@ -350,6 +367,8 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
>> __i915_vma_unpin(vma);
>> if (ret == 0)
>> ret = __i915_vma_unbind(vma);
>> +
>> + i915_gem_object_unlock(vma->obj);
>> }
>>
>> return ret;
>> @@ -393,6 +412,9 @@ int i915_gem_evict_vm(struct i915_address_space *vm)
>> if (i915_vma_is_pinned(vma))
>> continue;
>>
>> + if (!i915_gem_object_trylock(vma->obj))
>> + continue;
>> +
>> __i915_vma_pin(vma);
>> list_add(&vma->evict_link, &eviction_list);
>> }
>> @@ -406,6 +428,8 @@ int i915_gem_evict_vm(struct i915_address_space *vm)
>> ret = __i915_vma_unbind(vma);
>> if (ret != -EINTR) /* "Get me out of here!" */
>> ret = 0;
>> +
>> + i915_gem_object_unlock(vma->obj);
>> }
>> } while (ret == 0);
>>
>> --
>> 2.33.0
>>