From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dave Airlie Subject: [PATCH] drm/ttm: remove race and optimise evicting destroyed buffers. Date: Wed, 29 Sep 2010 13:45:30 +1000 Message-ID: <1285731930-20167-1-git-send-email-airlied@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by gabe.freedesktop.org (Postfix) with ESMTP id 2918E9E908 for ; Tue, 28 Sep 2010 20:45:33 -0700 (PDT) Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id o8T3jWru005654 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 28 Sep 2010 23:45:32 -0400 Received: from clockmaker-el6.bne.redhat.com (dhcp-0-222.bne.redhat.com [10.64.0.222]) by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id o8T3jUlI017222 for ; Tue, 28 Sep 2010 23:45:31 -0400 List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org Errors-To: dri-devel-bounces+sf-dri-devel=m.gmane.org@lists.freedesktop.org To: dri-devel@lists.freedesktop.org List-Id: dri-devel@lists.freedesktop.org From: Dave Airlie A race condition was identifed that led to a leak of the BOs, when a bo was on the delayed delete list and was selected for eviction, the code would enter thread 1 (evict) -- thread2 (dd) bo added to delayed destroy list take lru lock ttm_mem_evict_first called - reserve locked - remove from lru - drop lru lock take lru lock try del from lru - get put_count == 0 try to reserve locked - drop lru lock to wait unreserved call bo_evict unreserve buffer - take lru lock - add back to lru - unreserver - drop lru lock take lru lock due to unreserved unbind ttm drop put_count references (which is 0) despite being on the lru/swap lru lists. leak due to never losing reference The obvious quick fix is to try the bo from the ddestroy list if we are going to evict it, however if we are going to evict a buffer that is going to be destroyed we should just destroy it instead, its more likely to be a lighter-weight operation than evict + delayed destroy. This patch check is a bo that we are about to evict is actually on the destroy list and if it is, it unreserves it (without adding to lru), and cleans up its references. It should then get destroyed via normal ref counting means. proposed fix for: https://bugzilla.redhat.com/show_bug.cgi?id=615505 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=591061 Signed-off-by: Dave Airlie --- drivers/gpu/drm/ttm/ttm_bo.c | 18 +++++++++++++++--- 1 files changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index cb4cf7e..60689b1 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -689,7 +689,7 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev, struct ttm_mem_type_manager *man = &bdev->man[mem_type]; struct ttm_buffer_object *bo; int ret, put_count = 0; - + bool destroy = false; retry: spin_lock(&glob->lru_lock); if (list_empty(&man->lru)) { @@ -719,6 +719,13 @@ retry: } put_count = ttm_bo_del_from_lru(bo); + + /* is the buffer currently on the delayed destroy list? */ + if (!list_empty(&bo->ddestroy)) { + list_del_init(&bo->ddestroy); + destroy = true; + put_count++; + } spin_unlock(&glob->lru_lock); BUG_ON(ret != 0); @@ -726,8 +733,13 @@ retry: while (put_count--) kref_put(&bo->list_kref, ttm_bo_ref_bug); - ret = ttm_bo_evict(bo, interruptible, no_wait_reserve, no_wait_gpu); - ttm_bo_unreserve(bo); + if (destroy) { + atomic_set(&bo->reserved, 0); + ret = ttm_bo_cleanup_refs(bo, !no_wait_gpu); + } else { + ret = ttm_bo_evict(bo, interruptible, no_wait_reserve, no_wait_gpu); + ttm_bo_unreserve(bo); + } kref_put(&bo->list_kref, ttm_bo_release_list); return ret; -- 1.7.2.3