* [PATCH 0/6] drm/ttm: lru lock atomicity removal
@ 2012-11-30 12:11 Maarten Lankhorst
2012-11-30 12:12 ` [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve Maarten Lankhorst
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Maarten Lankhorst @ 2012-11-30 12:11 UTC (permalink / raw)
To: Thomas Hellstrom; +Cc: dri-devel
With all the previous patches there shouldn't be anything lying on the reservations being atomic
with removal of the bo's from the lru lists any more.
As such we can finally fix the locking primitives and make it act like normal mutex calls.
Patch 1 is the actual removal of the guarantee in ttm_bo_reserve
patch 2 is a cleanup of ttm_eu_backoff_reservation from removing that guarantee
patch 3...6 are introducing ttm_bo_reserve_slowpath.
After this series, this should directly map to my proposed extensions to mutex.
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve
2012-11-30 12:11 [PATCH 0/6] drm/ttm: lru lock atomicity removal Maarten Lankhorst
@ 2012-11-30 12:12 ` Maarten Lankhorst
2012-11-30 12:12 ` [PATCH 2/6] drm/ttm: cleanup ttm_eu_reserve_buffers handling Maarten Lankhorst
` (5 more replies)
2012-11-30 15:59 ` [PATCH 0/6] drm/ttm: lru lock atomicity removal Thomas Hellstrom
2012-12-06 1:15 ` Jerome Glisse
2 siblings, 6 replies; 15+ messages in thread
From: Maarten Lankhorst @ 2012-11-30 12:12 UTC (permalink / raw)
To: dri-devel, thellstrom
There should no longer be assumptions that reserve will always succeed
with the lru lock held, so we can safely break the whole atomic
reserve/lru thing. As a bonus this fixes most lockdep annotations for
reservations.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
drivers/gpu/drm/ttm/ttm_bo.c | 54 ++++++++++++++++++++++------------
drivers/gpu/drm/ttm/ttm_execbuf_util.c | 2 +-
include/drm/ttm/ttm_bo_driver.h | 19 +++---------
3 files changed, 40 insertions(+), 35 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 9028327..61b5cd0 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -213,14 +213,13 @@ int ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
return put_count;
}
-int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
+int ttm_bo_reserve_nolru(struct ttm_buffer_object *bo,
bool interruptible,
bool no_wait, bool use_sequence, uint32_t sequence)
{
- struct ttm_bo_global *glob = bo->glob;
int ret;
- while (unlikely(atomic_read(&bo->reserved) != 0)) {
+ while (unlikely(atomic_xchg(&bo->reserved, 1) != 0)) {
/**
* Deadlock avoidance for multi-bo reserving.
*/
@@ -241,26 +240,36 @@ int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
if (no_wait)
return -EBUSY;
- spin_unlock(&glob->lru_lock);
ret = ttm_bo_wait_unreserved(bo, interruptible);
- spin_lock(&glob->lru_lock);
if (unlikely(ret))
return ret;
}
- atomic_set(&bo->reserved, 1);
if (use_sequence) {
+ bool wake_up = false;
/**
* Wake up waiters that may need to recheck for deadlock,
* if we decreased the sequence number.
*/
if (unlikely((bo->val_seq - sequence < (1 << 31))
|| !bo->seq_valid))
- wake_up_all(&bo->event_queue);
+ wake_up = true;
+ /*
+ * In the worst case with memory ordering these values can be
+ * seen in the wrong order. However since we call wake_up_all
+ * in that case, this will hopefully not pose a problem,
+ * and the worst case would only cause someone to accidentally
+ * hit -EAGAIN in ttm_bo_reserve when they see old value of
+ * val_seq. However this would only happen if seq_valid was
+ * written before val_seq was, and just means some slightly
+ * increased cpu usage
+ */
bo->val_seq = sequence;
bo->seq_valid = true;
+ if (wake_up)
+ wake_up_all(&bo->event_queue);
} else {
bo->seq_valid = false;
}
@@ -289,14 +298,14 @@ int ttm_bo_reserve(struct ttm_buffer_object *bo,
int put_count = 0;
int ret;
- spin_lock(&glob->lru_lock);
- ret = ttm_bo_reserve_locked(bo, interruptible, no_wait, use_sequence,
- sequence);
- if (likely(ret == 0))
+ ret = ttm_bo_reserve_nolru(bo, interruptible, no_wait, use_sequence,
+ sequence);
+ if (likely(ret == 0)) {
+ spin_lock(&glob->lru_lock);
put_count = ttm_bo_del_from_lru(bo);
- spin_unlock(&glob->lru_lock);
-
- ttm_bo_list_ref_sub(bo, put_count, true);
+ spin_unlock(&glob->lru_lock);
+ ttm_bo_list_ref_sub(bo, put_count, true);
+ }
return ret;
}
@@ -510,7 +519,7 @@ static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
int ret;
spin_lock(&glob->lru_lock);
- ret = ttm_bo_reserve_locked(bo, false, true, false, 0);
+ ret = ttm_bo_reserve_nolru(bo, false, true, false, 0);
spin_lock(&bdev->fence_lock);
(void) ttm_bo_wait(bo, false, false, true);
@@ -603,7 +612,7 @@ static int ttm_bo_cleanup_refs_and_unlock(struct ttm_buffer_object *bo,
return ret;
spin_lock(&glob->lru_lock);
- ret = ttm_bo_reserve_locked(bo, false, true, false, 0);
+ ret = ttm_bo_reserve_nolru(bo, false, true, false, 0);
/*
* We raced, and lost, someone else holds the reservation now,
@@ -667,7 +676,14 @@ static int ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
kref_get(&nentry->list_kref);
}
- ret = ttm_bo_reserve_locked(entry, false, !remove_all, false, 0);
+ ret = ttm_bo_reserve_nolru(entry, false, true, false, 0);
+ if (remove_all && ret) {
+ spin_unlock(&glob->lru_lock);
+ ret = ttm_bo_reserve_nolru(entry, false, false,
+ false, 0);
+ spin_lock(&glob->lru_lock);
+ }
+
if (!ret)
ret = ttm_bo_cleanup_refs_and_unlock(entry, false,
!remove_all);
@@ -818,7 +834,7 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
spin_lock(&glob->lru_lock);
list_for_each_entry(bo, &man->lru, lru) {
- ret = ttm_bo_reserve_locked(bo, false, true, false, 0);
+ ret = ttm_bo_reserve_nolru(bo, false, true, false, 0);
if (!ret)
break;
}
@@ -1799,7 +1815,7 @@ static int ttm_bo_swapout(struct ttm_mem_shrink *shrink)
spin_lock(&glob->lru_lock);
list_for_each_entry(bo, &glob->swap_lru, swap) {
- ret = ttm_bo_reserve_locked(bo, false, true, false, 0);
+ ret = ttm_bo_reserve_nolru(bo, false, true, false, 0);
if (!ret)
break;
}
diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
index cd9e452..bd37b5c 100644
--- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
+++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
@@ -153,7 +153,7 @@ retry:
struct ttm_buffer_object *bo = entry->bo;
retry_this_bo:
- ret = ttm_bo_reserve_locked(bo, true, true, true, val_seq);
+ ret = ttm_bo_reserve_nolru(bo, true, true, true, val_seq);
switch (ret) {
case 0:
break;
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index e3a43a4..6fff432 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -790,16 +790,7 @@ extern void ttm_mem_io_unlock(struct ttm_mem_type_manager *man);
* to make room for a buffer already reserved. (Buffers are reserved before
* they are evicted). The following algorithm prevents such deadlocks from
* occurring:
- * 1) Buffers are reserved with the lru spinlock held. Upon successful
- * reservation they are removed from the lru list. This stops a reserved buffer
- * from being evicted. However the lru spinlock is released between the time
- * a buffer is selected for eviction and the time it is reserved.
- * Therefore a check is made when a buffer is reserved for eviction, that it
- * is still the first buffer in the lru list, before it is removed from the
- * list. @check_lru == 1 forces this check. If it fails, the function returns
- * -EINVAL, and the caller should then choose a new buffer to evict and repeat
- * the procedure.
- * 2) Processes attempting to reserve multiple buffers other than for eviction,
+ * Processes attempting to reserve multiple buffers other than for eviction,
* (typically execbuf), should first obtain a unique 32-bit
* validation sequence number,
* and call this function with @use_sequence == 1 and @sequence == the unique
@@ -832,7 +823,7 @@ extern int ttm_bo_reserve(struct ttm_buffer_object *bo,
/**
- * ttm_bo_reserve_locked:
+ * ttm_bo_reserve_nolru:
*
* @bo: A pointer to a struct ttm_buffer_object.
* @interruptible: Sleep interruptible if waiting.
@@ -840,9 +831,7 @@ extern int ttm_bo_reserve(struct ttm_buffer_object *bo,
* @use_sequence: If @bo is already reserved, Only sleep waiting for
* it to become unreserved if @sequence < (@bo)->sequence.
*
- * Must be called with struct ttm_bo_global::lru_lock held,
- * and will not remove reserved buffers from the lru lists.
- * The function may release the LRU spinlock if it needs to sleep.
+ * Will not remove reserved buffers from the lru lists.
* Otherwise identical to ttm_bo_reserve.
*
* Returns:
@@ -855,7 +844,7 @@ extern int ttm_bo_reserve(struct ttm_buffer_object *bo,
* -EDEADLK: Bo already reserved using @sequence. This error code will only
* be returned if @use_sequence is set to true.
*/
-extern int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
+extern int ttm_bo_reserve_nolru(struct ttm_buffer_object *bo,
bool interruptible,
bool no_wait, bool use_sequence,
uint32_t sequence);
--
1.8.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/6] drm/ttm: cleanup ttm_eu_reserve_buffers handling
2012-11-30 12:12 ` [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve Maarten Lankhorst
@ 2012-11-30 12:12 ` Maarten Lankhorst
2012-11-30 12:12 ` [PATCH 3/6] drm/ttm: add ttm_bo_reserve_slowpath Maarten Lankhorst
` (4 subsequent siblings)
5 siblings, 0 replies; 15+ messages in thread
From: Maarten Lankhorst @ 2012-11-30 12:12 UTC (permalink / raw)
To: dri-devel, thellstrom
With the lru lock no longer required for protecting reservations we
can just do a ttm_bo_reserve_nolru on -EBUSY, and handle all errors
in a single path.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
drivers/gpu/drm/ttm/ttm_execbuf_util.c | 53 ++++++++++++++--------------------
1 file changed, 21 insertions(+), 32 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
index bd37b5c..c7d3236 100644
--- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
+++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
@@ -82,22 +82,6 @@ static void ttm_eu_list_ref_sub(struct list_head *list)
}
}
-static int ttm_eu_wait_unreserved_locked(struct list_head *list,
- struct ttm_buffer_object *bo)
-{
- struct ttm_bo_global *glob = bo->glob;
- int ret;
-
- ttm_eu_del_from_lru_locked(list);
- spin_unlock(&glob->lru_lock);
- ret = ttm_bo_wait_unreserved(bo, true);
- spin_lock(&glob->lru_lock);
- if (unlikely(ret != 0))
- ttm_eu_backoff_reservation_locked(list);
- return ret;
-}
-
-
void ttm_eu_backoff_reservation(struct list_head *list)
{
struct ttm_validate_buffer *entry;
@@ -152,19 +136,23 @@ retry:
list_for_each_entry(entry, list, head) {
struct ttm_buffer_object *bo = entry->bo;
-retry_this_bo:
ret = ttm_bo_reserve_nolru(bo, true, true, true, val_seq);
switch (ret) {
case 0:
break;
case -EBUSY:
- ret = ttm_eu_wait_unreserved_locked(list, bo);
- if (unlikely(ret != 0)) {
- spin_unlock(&glob->lru_lock);
- ttm_eu_list_ref_sub(list);
- return ret;
- }
- goto retry_this_bo;
+ ttm_eu_del_from_lru_locked(list);
+ spin_unlock(&glob->lru_lock);
+ ret = ttm_bo_reserve_nolru(bo, true, false,
+ true, val_seq);
+ spin_lock(&glob->lru_lock);
+ if (!ret)
+ break;
+
+ if (unlikely(ret != -EAGAIN))
+ goto err;
+
+ /* fallthrough */
case -EAGAIN:
ttm_eu_backoff_reservation_locked(list);
spin_unlock(&glob->lru_lock);
@@ -174,18 +162,13 @@ retry_this_bo:
return ret;
goto retry;
default:
- ttm_eu_backoff_reservation_locked(list);
- spin_unlock(&glob->lru_lock);
- ttm_eu_list_ref_sub(list);
- return ret;
+ goto err;
}
entry->reserved = true;
if (unlikely(atomic_read(&bo->cpu_writers) > 0)) {
- ttm_eu_backoff_reservation_locked(list);
- spin_unlock(&glob->lru_lock);
- ttm_eu_list_ref_sub(list);
- return -EBUSY;
+ ret = -EBUSY;
+ goto err;
}
}
@@ -194,6 +177,12 @@ retry_this_bo:
ttm_eu_list_ref_sub(list);
return 0;
+
+err:
+ ttm_eu_backoff_reservation_locked(list);
+ spin_unlock(&glob->lru_lock);
+ ttm_eu_list_ref_sub(list);
+ return ret;
}
EXPORT_SYMBOL(ttm_eu_reserve_buffers);
--
1.8.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 3/6] drm/ttm: add ttm_bo_reserve_slowpath
2012-11-30 12:12 ` [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve Maarten Lankhorst
2012-11-30 12:12 ` [PATCH 2/6] drm/ttm: cleanup ttm_eu_reserve_buffers handling Maarten Lankhorst
@ 2012-11-30 12:12 ` Maarten Lankhorst
2012-11-30 12:12 ` [PATCH 4/6] drm/ttm: use ttm_bo_reserve_slowpath_nolru in ttm_eu_reserve_buffers Maarten Lankhorst
` (3 subsequent siblings)
5 siblings, 0 replies; 15+ messages in thread
From: Maarten Lankhorst @ 2012-11-30 12:12 UTC (permalink / raw)
To: dri-devel, thellstrom
Instead of dropping everything, waiting for the bo to be unreserved
and trying over, a better strategy would be to do a blocking wait.
This can be mapped a lot better to a mutex_lock-like call.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
drivers/gpu/drm/ttm/ttm_bo.c | 47 +++++++++++++++++++++++++++++++++++++++++
include/drm/ttm/ttm_bo_driver.h | 30 ++++++++++++++++++++++++++
2 files changed, 77 insertions(+)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 61b5cd0..174b325 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -310,6 +310,53 @@ int ttm_bo_reserve(struct ttm_buffer_object *bo,
return ret;
}
+int ttm_bo_reserve_slowpath_nolru(struct ttm_buffer_object *bo,
+ bool interruptible, uint32_t sequence)
+{
+ bool wake_up = false;
+ int ret;
+
+ while (unlikely(atomic_xchg(&bo->reserved, 1) != 0)) {
+ WARN_ON(bo->seq_valid && sequence == bo->val_seq);
+
+ ret = ttm_bo_wait_unreserved(bo, interruptible);
+
+ if (unlikely(ret))
+ return ret;
+ }
+
+ if ((bo->val_seq - sequence < (1 << 31)) || !bo->seq_valid)
+ wake_up = true;
+
+ /**
+ * Wake up waiters that may need to recheck for deadlock,
+ * if we decreased the sequence number.
+ */
+ bo->val_seq = sequence;
+ bo->seq_valid = true;
+ if (wake_up)
+ wake_up_all(&bo->event_queue);
+
+ return 0;
+}
+
+int ttm_bo_reserve_slowpath(struct ttm_buffer_object *bo,
+ bool interruptible, uint32_t sequence)
+{
+ struct ttm_bo_global *glob = bo->glob;
+ int put_count, ret;
+
+ ret = ttm_bo_reserve_slowpath_nolru(bo, interruptible, sequence);
+ if (likely(!ret)) {
+ spin_lock(&glob->lru_lock);
+ put_count = ttm_bo_del_from_lru(bo);
+ spin_unlock(&glob->lru_lock);
+ ttm_bo_list_ref_sub(bo, put_count, true);
+ }
+ return ret;
+}
+EXPORT_SYMBOL(ttm_bo_reserve_slowpath);
+
void ttm_bo_unreserve_locked(struct ttm_buffer_object *bo)
{
ttm_bo_add_to_lru(bo);
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 6fff432..5af71af 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -821,6 +821,36 @@ extern int ttm_bo_reserve(struct ttm_buffer_object *bo,
bool interruptible,
bool no_wait, bool use_sequence, uint32_t sequence);
+/**
+ * ttm_bo_reserve_slowpath_nolru:
+ * @bo: A pointer to a struct ttm_buffer_object.
+ * @interruptible: Sleep interruptible if waiting.
+ * @sequence: Set (@bo)->sequence to this value after lock
+ *
+ * This is called after ttm_bo_reserve returns -EAGAIN and we backed off
+ * from all our other reservations. Because there are no other reservations
+ * held by us, this function cannot deadlock any more.
+ *
+ * Will not remove reserved buffers from the lru lists.
+ * Otherwise identical to ttm_bo_reserve_slowpath.
+ */
+extern int ttm_bo_reserve_slowpath_nolru(struct ttm_buffer_object *bo,
+ bool interruptible,
+ uint32_t sequence);
+
+
+/**
+ * ttm_bo_reserve_slowpath:
+ * @bo: A pointer to a struct ttm_buffer_object.
+ * @interruptible: Sleep interruptible if waiting.
+ * @sequence: Set (@bo)->sequence to this value after lock
+ *
+ * This is called after ttm_bo_reserve returns -EAGAIN and we backed off
+ * from all our other reservations. Because there are no other reservations
+ * held by us, this function cannot deadlock any more.
+ */
+extern int ttm_bo_reserve_slowpath(struct ttm_buffer_object *bo,
+ bool interruptible, uint32_t sequence);
/**
* ttm_bo_reserve_nolru:
--
1.8.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 4/6] drm/ttm: use ttm_bo_reserve_slowpath_nolru in ttm_eu_reserve_buffers
2012-11-30 12:12 ` [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve Maarten Lankhorst
2012-11-30 12:12 ` [PATCH 2/6] drm/ttm: cleanup ttm_eu_reserve_buffers handling Maarten Lankhorst
2012-11-30 12:12 ` [PATCH 3/6] drm/ttm: add ttm_bo_reserve_slowpath Maarten Lankhorst
@ 2012-11-30 12:12 ` Maarten Lankhorst
2012-12-06 1:36 ` Jerome Glisse
2012-11-30 12:12 ` [PATCH 5/6] drm/nouveau: use ttm_bo_reserve_slowpath in validate_init Maarten Lankhorst
` (2 subsequent siblings)
5 siblings, 1 reply; 15+ messages in thread
From: Maarten Lankhorst @ 2012-11-30 12:12 UTC (permalink / raw)
To: dri-devel, thellstrom
This requires re-use of the seqno, which increases fairness slightly.
Instead of spinning with a new seqno every time we keep the current one,
but still drop all other reservations we hold. Only when we succeed,
we try to get back our other reservations again.
This should increase fairness slightly as well.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
drivers/gpu/drm/ttm/ttm_execbuf_util.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
index c7d3236..c02b2b6 100644
--- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
+++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
@@ -129,13 +129,17 @@ int ttm_eu_reserve_buffers(struct list_head *list)
entry = list_first_entry(list, struct ttm_validate_buffer, head);
glob = entry->bo->glob;
-retry:
spin_lock(&glob->lru_lock);
val_seq = entry->bo->bdev->val_seq++;
+retry:
list_for_each_entry(entry, list, head) {
struct ttm_buffer_object *bo = entry->bo;
+ /* already slowpath reserved? */
+ if (entry->reserved)
+ continue;
+
ret = ttm_bo_reserve_nolru(bo, true, true, true, val_seq);
switch (ret) {
case 0:
@@ -157,9 +161,15 @@ retry:
ttm_eu_backoff_reservation_locked(list);
spin_unlock(&glob->lru_lock);
ttm_eu_list_ref_sub(list);
- ret = ttm_bo_wait_unreserved(bo, true);
+ ret = ttm_bo_reserve_slowpath_nolru(bo, true, val_seq);
if (unlikely(ret != 0))
return ret;
+ spin_lock(&glob->lru_lock);
+ entry->reserved = true;
+ if (unlikely(atomic_read(&bo->cpu_writers) > 0)) {
+ ret = -EBUSY;
+ goto err;
+ }
goto retry;
default:
goto err;
--
1.8.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 5/6] drm/nouveau: use ttm_bo_reserve_slowpath in validate_init
2012-11-30 12:12 ` [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve Maarten Lankhorst
` (2 preceding siblings ...)
2012-11-30 12:12 ` [PATCH 4/6] drm/ttm: use ttm_bo_reserve_slowpath_nolru in ttm_eu_reserve_buffers Maarten Lankhorst
@ 2012-11-30 12:12 ` Maarten Lankhorst
2012-11-30 12:13 ` [PATCH 6/6] drm/ttm: unexport ttm_bo_wait_unreserved Maarten Lankhorst
2012-12-06 1:19 ` [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve Jerome Glisse
5 siblings, 0 replies; 15+ messages in thread
From: Maarten Lankhorst @ 2012-11-30 12:12 UTC (permalink / raw)
To: dri-devel, thellstrom
Similar rationale to the identical commit in drm/ttm.
Instead of only waiting for unreservation, we make sure we actually
own the reservation, then retry to get the rest.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
drivers/gpu/drm/nouveau/nouveau_gem.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 7b9364b..6f58604 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -321,6 +321,7 @@ validate_init(struct nouveau_channel *chan, struct drm_file *file_priv,
uint32_t sequence;
int trycnt = 0;
int ret, i;
+ struct nouveau_bo *res_bo = NULL;
sequence = atomic_add_return(1, &drm->ttm.validate_sequence);
retry:
@@ -341,6 +342,11 @@ retry:
return -ENOENT;
}
nvbo = gem->driver_private;
+ if (nvbo == res_bo) {
+ res_bo = NULL;
+ drm_gem_object_unreference_unlocked(gem);
+ continue;
+ }
if (nvbo->reserved_by && nvbo->reserved_by == file_priv) {
NV_ERROR(drm, "multiple instances of buffer %d on "
@@ -353,15 +359,18 @@ retry:
ret = ttm_bo_reserve(&nvbo->bo, true, false, true, sequence);
if (ret) {
validate_fini(op, NULL);
- if (unlikely(ret == -EAGAIN))
- ret = ttm_bo_wait_unreserved(&nvbo->bo, true);
- drm_gem_object_unreference_unlocked(gem);
+ if (unlikely(ret == -EAGAIN)) {
+ ret = ttm_bo_reserve_slowpath(&nvbo->bo, true,
+ sequence);
+ if (!ret)
+ res_bo = nvbo;
+ }
if (unlikely(ret)) {
+ drm_gem_object_unreference_unlocked(gem);
if (ret != -ERESTARTSYS)
NV_ERROR(drm, "fail reserve\n");
return ret;
}
- goto retry;
}
b->user_priv = (uint64_t)(unsigned long)nvbo;
@@ -383,6 +392,8 @@ retry:
validate_fini(op, NULL);
return -EINVAL;
}
+ if (nvbo == res_bo)
+ goto retry;
}
return 0;
--
1.8.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 6/6] drm/ttm: unexport ttm_bo_wait_unreserved
2012-11-30 12:12 ` [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve Maarten Lankhorst
` (3 preceding siblings ...)
2012-11-30 12:12 ` [PATCH 5/6] drm/nouveau: use ttm_bo_reserve_slowpath in validate_init Maarten Lankhorst
@ 2012-11-30 12:13 ` Maarten Lankhorst
2012-12-06 1:19 ` [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve Jerome Glisse
5 siblings, 0 replies; 15+ messages in thread
From: Maarten Lankhorst @ 2012-11-30 12:13 UTC (permalink / raw)
To: dri-devel, thellstrom
All legitimate users of this function outside ttm_bo.c are gone, now
it's only an implementation detail.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
drivers/gpu/drm/ttm/ttm_bo.c | 4 ++--
include/drm/ttm/ttm_bo_driver.h | 12 ------------
2 files changed, 2 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 174b325..fd78104 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -158,7 +158,8 @@ static void ttm_bo_release_list(struct kref *list_kref)
ttm_mem_global_free(bdev->glob->mem_glob, acc_size);
}
-int ttm_bo_wait_unreserved(struct ttm_buffer_object *bo, bool interruptible)
+static int ttm_bo_wait_unreserved(struct ttm_buffer_object *bo,
+ bool interruptible)
{
if (interruptible) {
return wait_event_interruptible(bo->event_queue,
@@ -168,7 +169,6 @@ int ttm_bo_wait_unreserved(struct ttm_buffer_object *bo, bool interruptible)
return 0;
}
}
-EXPORT_SYMBOL(ttm_bo_wait_unreserved);
void ttm_bo_add_to_lru(struct ttm_buffer_object *bo)
{
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 5af71af..0fbd046 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -898,18 +898,6 @@ extern void ttm_bo_unreserve(struct ttm_buffer_object *bo);
*/
extern void ttm_bo_unreserve_locked(struct ttm_buffer_object *bo);
-/**
- * ttm_bo_wait_unreserved
- *
- * @bo: A pointer to a struct ttm_buffer_object.
- *
- * Wait for a struct ttm_buffer_object to become unreserved.
- * This is typically used in the execbuf code to relax cpu-usage when
- * a potential deadlock condition backoff.
- */
-extern int ttm_bo_wait_unreserved(struct ttm_buffer_object *bo,
- bool interruptible);
-
/*
* ttm_bo_util.c
*/
--
1.8.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 0/6] drm/ttm: lru lock atomicity removal
2012-11-30 12:11 [PATCH 0/6] drm/ttm: lru lock atomicity removal Maarten Lankhorst
2012-11-30 12:12 ` [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve Maarten Lankhorst
@ 2012-11-30 15:59 ` Thomas Hellstrom
2012-12-06 1:15 ` Jerome Glisse
2 siblings, 0 replies; 15+ messages in thread
From: Thomas Hellstrom @ 2012-11-30 15:59 UTC (permalink / raw)
To: Maarten Lankhorst; +Cc: dri-devel
On 11/30/2012 01:11 PM, Maarten Lankhorst wrote:
> With all the previous patches there shouldn't be anything lying on the reservations being atomic
> with removal of the bo's from the lru lists any more.
>
> As such we can finally fix the locking primitives and make it act like normal mutex calls.
>
> Patch 1 is the actual removal of the guarantee in ttm_bo_reserve
> patch 2 is a cleanup of ttm_eu_backoff_reservation from removing that guarantee
> patch 3...6 are introducing ttm_bo_reserve_slowpath.
>
> After this series, this should directly map to my proposed extensions to mutex.
>
Maarten,
I'm going on parental leave next week and don't have time to review
these patches in detail for at least three weeks.
In principle I'm OK with the changes, introduced, though, but I guess we
need another reviewer to step up.
/Thomas
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/6] drm/ttm: lru lock atomicity removal
2012-11-30 12:11 [PATCH 0/6] drm/ttm: lru lock atomicity removal Maarten Lankhorst
2012-11-30 12:12 ` [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve Maarten Lankhorst
2012-11-30 15:59 ` [PATCH 0/6] drm/ttm: lru lock atomicity removal Thomas Hellstrom
@ 2012-12-06 1:15 ` Jerome Glisse
2012-12-06 1:38 ` Jerome Glisse
2 siblings, 1 reply; 15+ messages in thread
From: Jerome Glisse @ 2012-12-06 1:15 UTC (permalink / raw)
To: Maarten Lankhorst; +Cc: Thomas Hellstrom, dri-devel
On Fri, Nov 30, 2012 at 7:11 AM, Maarten Lankhorst
<maarten.lankhorst@canonical.com> wrote:
> With all the previous patches there shouldn't be anything lying on the reservations being atomic
> with removal of the bo's from the lru lists any more.
>
> As such we can finally fix the locking primitives and make it act like normal mutex calls.
>
> Patch 1 is the actual removal of the guarantee in ttm_bo_reserve
> patch 2 is a cleanup of ttm_eu_backoff_reservation from removing that guarantee
> patch 3...6 are introducing ttm_bo_reserve_slowpath.
>
> After this series, this should directly map to my proposed extensions to mutex.
For the series :
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Few comments in some patches but nothing important.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve
2012-11-30 12:12 ` [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve Maarten Lankhorst
` (4 preceding siblings ...)
2012-11-30 12:13 ` [PATCH 6/6] drm/ttm: unexport ttm_bo_wait_unreserved Maarten Lankhorst
@ 2012-12-06 1:19 ` Jerome Glisse
2012-12-06 9:55 ` Maarten Lankhorst
5 siblings, 1 reply; 15+ messages in thread
From: Jerome Glisse @ 2012-12-06 1:19 UTC (permalink / raw)
To: Maarten Lankhorst; +Cc: thellstrom, dri-devel
On Fri, Nov 30, 2012 at 7:12 AM, Maarten Lankhorst
<m.b.lankhorst@gmail.com> wrote:
> There should no longer be assumptions that reserve will always succeed
> with the lru lock held, so we can safely break the whole atomic
> reserve/lru thing. As a bonus this fixes most lockdep annotations for
> reservations.
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> ---
> drivers/gpu/drm/ttm/ttm_bo.c | 54 ++++++++++++++++++++++------------
> drivers/gpu/drm/ttm/ttm_execbuf_util.c | 2 +-
> include/drm/ttm/ttm_bo_driver.h | 19 +++---------
> 3 files changed, 40 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 9028327..61b5cd0 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -213,14 +213,13 @@ int ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
> return put_count;
> }
>
> -int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
> +int ttm_bo_reserve_nolru(struct ttm_buffer_object *bo,
> bool interruptible,
> bool no_wait, bool use_sequence, uint32_t sequence)
> {
> - struct ttm_bo_global *glob = bo->glob;
> int ret;
>
> - while (unlikely(atomic_read(&bo->reserved) != 0)) {
> + while (unlikely(atomic_xchg(&bo->reserved, 1) != 0)) {
> /**
> * Deadlock avoidance for multi-bo reserving.
> */
Regarding memory barrier discussion we could add a smp_rdmb here or
few line below before the read of sequence for -EAGAIN. But it's not
really important, if a CPU doesn't see new sequence value the worst
case is that the CPU will go to wait again before reaching back this
section and returning EAGAIN. So it's just gonna waste some CPU cycle
i can't see anything bad that could happen.
> @@ -241,26 +240,36 @@ int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
> if (no_wait)
> return -EBUSY;
>
> - spin_unlock(&glob->lru_lock);
> ret = ttm_bo_wait_unreserved(bo, interruptible);
> - spin_lock(&glob->lru_lock);
>
> if (unlikely(ret))
> return ret;
> }
>
> - atomic_set(&bo->reserved, 1);
> if (use_sequence) {
> + bool wake_up = false;
> /**
> * Wake up waiters that may need to recheck for deadlock,
> * if we decreased the sequence number.
> */
> if (unlikely((bo->val_seq - sequence < (1 << 31))
> || !bo->seq_valid))
> - wake_up_all(&bo->event_queue);
> + wake_up = true;
>
> + /*
> + * In the worst case with memory ordering these values can be
> + * seen in the wrong order. However since we call wake_up_all
> + * in that case, this will hopefully not pose a problem,
> + * and the worst case would only cause someone to accidentally
> + * hit -EAGAIN in ttm_bo_reserve when they see old value of
> + * val_seq. However this would only happen if seq_valid was
> + * written before val_seq was, and just means some slightly
> + * increased cpu usage
> + */
> bo->val_seq = sequence;
> bo->seq_valid = true;
If we want we could add smp_wrmb here but see above comment on
usefullness of this.
> + if (wake_up)
> + wake_up_all(&bo->event_queue);
> } else {
> bo->seq_valid = false;
> }
> @@ -289,14 +298,14 @@ int ttm_bo_reserve(struct ttm_buffer_object *bo,
> int put_count = 0;
> int ret;
>
> - spin_lock(&glob->lru_lock);
> - ret = ttm_bo_reserve_locked(bo, interruptible, no_wait, use_sequence,
> - sequence);
> - if (likely(ret == 0))
> + ret = ttm_bo_reserve_nolru(bo, interruptible, no_wait, use_sequence,
> + sequence);
> + if (likely(ret == 0)) {
> + spin_lock(&glob->lru_lock);
> put_count = ttm_bo_del_from_lru(bo);
> - spin_unlock(&glob->lru_lock);
> -
> - ttm_bo_list_ref_sub(bo, put_count, true);
> + spin_unlock(&glob->lru_lock);
> + ttm_bo_list_ref_sub(bo, put_count, true);
> + }
>
> return ret;
> }
> @@ -510,7 +519,7 @@ static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
> int ret;
>
> spin_lock(&glob->lru_lock);
> - ret = ttm_bo_reserve_locked(bo, false, true, false, 0);
> + ret = ttm_bo_reserve_nolru(bo, false, true, false, 0);
>
> spin_lock(&bdev->fence_lock);
> (void) ttm_bo_wait(bo, false, false, true);
> @@ -603,7 +612,7 @@ static int ttm_bo_cleanup_refs_and_unlock(struct ttm_buffer_object *bo,
> return ret;
>
> spin_lock(&glob->lru_lock);
> - ret = ttm_bo_reserve_locked(bo, false, true, false, 0);
> + ret = ttm_bo_reserve_nolru(bo, false, true, false, 0);
>
> /*
> * We raced, and lost, someone else holds the reservation now,
> @@ -667,7 +676,14 @@ static int ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
> kref_get(&nentry->list_kref);
> }
>
> - ret = ttm_bo_reserve_locked(entry, false, !remove_all, false, 0);
> + ret = ttm_bo_reserve_nolru(entry, false, true, false, 0);
> + if (remove_all && ret) {
> + spin_unlock(&glob->lru_lock);
> + ret = ttm_bo_reserve_nolru(entry, false, false,
> + false, 0);
> + spin_lock(&glob->lru_lock);
> + }
> +
> if (!ret)
> ret = ttm_bo_cleanup_refs_and_unlock(entry, false,
> !remove_all);
> @@ -818,7 +834,7 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
>
> spin_lock(&glob->lru_lock);
> list_for_each_entry(bo, &man->lru, lru) {
> - ret = ttm_bo_reserve_locked(bo, false, true, false, 0);
> + ret = ttm_bo_reserve_nolru(bo, false, true, false, 0);
> if (!ret)
> break;
> }
> @@ -1799,7 +1815,7 @@ static int ttm_bo_swapout(struct ttm_mem_shrink *shrink)
>
> spin_lock(&glob->lru_lock);
> list_for_each_entry(bo, &glob->swap_lru, swap) {
> - ret = ttm_bo_reserve_locked(bo, false, true, false, 0);
> + ret = ttm_bo_reserve_nolru(bo, false, true, false, 0);
> if (!ret)
> break;
> }
> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> index cd9e452..bd37b5c 100644
> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> @@ -153,7 +153,7 @@ retry:
> struct ttm_buffer_object *bo = entry->bo;
>
> retry_this_bo:
> - ret = ttm_bo_reserve_locked(bo, true, true, true, val_seq);
> + ret = ttm_bo_reserve_nolru(bo, true, true, true, val_seq);
> switch (ret) {
> case 0:
> break;
> diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
> index e3a43a4..6fff432 100644
> --- a/include/drm/ttm/ttm_bo_driver.h
> +++ b/include/drm/ttm/ttm_bo_driver.h
> @@ -790,16 +790,7 @@ extern void ttm_mem_io_unlock(struct ttm_mem_type_manager *man);
> * to make room for a buffer already reserved. (Buffers are reserved before
> * they are evicted). The following algorithm prevents such deadlocks from
> * occurring:
> - * 1) Buffers are reserved with the lru spinlock held. Upon successful
> - * reservation they are removed from the lru list. This stops a reserved buffer
> - * from being evicted. However the lru spinlock is released between the time
> - * a buffer is selected for eviction and the time it is reserved.
> - * Therefore a check is made when a buffer is reserved for eviction, that it
> - * is still the first buffer in the lru list, before it is removed from the
> - * list. @check_lru == 1 forces this check. If it fails, the function returns
> - * -EINVAL, and the caller should then choose a new buffer to evict and repeat
> - * the procedure.
> - * 2) Processes attempting to reserve multiple buffers other than for eviction,
> + * Processes attempting to reserve multiple buffers other than for eviction,
> * (typically execbuf), should first obtain a unique 32-bit
> * validation sequence number,
> * and call this function with @use_sequence == 1 and @sequence == the unique
> @@ -832,7 +823,7 @@ extern int ttm_bo_reserve(struct ttm_buffer_object *bo,
>
>
> /**
> - * ttm_bo_reserve_locked:
> + * ttm_bo_reserve_nolru:
> *
> * @bo: A pointer to a struct ttm_buffer_object.
> * @interruptible: Sleep interruptible if waiting.
> @@ -840,9 +831,7 @@ extern int ttm_bo_reserve(struct ttm_buffer_object *bo,
> * @use_sequence: If @bo is already reserved, Only sleep waiting for
> * it to become unreserved if @sequence < (@bo)->sequence.
> *
> - * Must be called with struct ttm_bo_global::lru_lock held,
> - * and will not remove reserved buffers from the lru lists.
> - * The function may release the LRU spinlock if it needs to sleep.
> + * Will not remove reserved buffers from the lru lists.
> * Otherwise identical to ttm_bo_reserve.
> *
> * Returns:
> @@ -855,7 +844,7 @@ extern int ttm_bo_reserve(struct ttm_buffer_object *bo,
> * -EDEADLK: Bo already reserved using @sequence. This error code will only
> * be returned if @use_sequence is set to true.
> */
> -extern int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
> +extern int ttm_bo_reserve_nolru(struct ttm_buffer_object *bo,
> bool interruptible,
> bool no_wait, bool use_sequence,
> uint32_t sequence);
> --
> 1.8.0
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/6] drm/ttm: use ttm_bo_reserve_slowpath_nolru in ttm_eu_reserve_buffers
2012-11-30 12:12 ` [PATCH 4/6] drm/ttm: use ttm_bo_reserve_slowpath_nolru in ttm_eu_reserve_buffers Maarten Lankhorst
@ 2012-12-06 1:36 ` Jerome Glisse
2012-12-06 10:52 ` Maarten Lankhorst
0 siblings, 1 reply; 15+ messages in thread
From: Jerome Glisse @ 2012-12-06 1:36 UTC (permalink / raw)
To: Maarten Lankhorst; +Cc: thellstrom, dri-devel
On Fri, Nov 30, 2012 at 7:12 AM, Maarten Lankhorst
<m.b.lankhorst@gmail.com> wrote:
> This requires re-use of the seqno, which increases fairness slightly.
> Instead of spinning with a new seqno every time we keep the current one,
> but still drop all other reservations we hold. Only when we succeed,
> we try to get back our other reservations again.
>
> This should increase fairness slightly as well.
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> ---
> drivers/gpu/drm/ttm/ttm_execbuf_util.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> index c7d3236..c02b2b6 100644
> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
> @@ -129,13 +129,17 @@ int ttm_eu_reserve_buffers(struct list_head *list)
> entry = list_first_entry(list, struct ttm_validate_buffer, head);
> glob = entry->bo->glob;
>
> -retry:
> spin_lock(&glob->lru_lock);
> val_seq = entry->bo->bdev->val_seq++;
>
> +retry:
After more thinking while i was going over patches to send comment i
think this one is bad, really bad. So here is bad case i see, we have
a list of 2 bo A & B. We succeed reserving A we set seq_valid to true
and val_seq to our val_seq. We fail on B, we go slow path and
unreserve A. Some other CPU reserve A but before it write either
seq_valid or val_seq we wakeup reserve B and retry to reserve A in
reserve_nolru we see A as reserved we enter the while loop see
seq_valid set and test val_seq against our same old val_seq we got ==
and return -EDEADLCK which is definitly not what we want to do. If we
inc val seq on retry we will never have this case. So what exactly not
incrementing it gave us ?
I think the read and write memory barrier i was talking in patch 1
should avoid any such things to happen but i need to sleep on that to
make nightmare about all things that can go wrong.
> list_for_each_entry(entry, list, head) {
> struct ttm_buffer_object *bo = entry->bo;
>
> + /* already slowpath reserved? */
> + if (entry->reserved)
> + continue;
> +
> ret = ttm_bo_reserve_nolru(bo, true, true, true, val_seq);
> switch (ret) {
> case 0:
> @@ -157,9 +161,15 @@ retry:
> ttm_eu_backoff_reservation_locked(list);
> spin_unlock(&glob->lru_lock);
> ttm_eu_list_ref_sub(list);
> - ret = ttm_bo_wait_unreserved(bo, true);
> + ret = ttm_bo_reserve_slowpath_nolru(bo, true, val_seq);
> if (unlikely(ret != 0))
> return ret;
> + spin_lock(&glob->lru_lock);
> + entry->reserved = true;
> + if (unlikely(atomic_read(&bo->cpu_writers) > 0)) {
> + ret = -EBUSY;
> + goto err;
> + }
> goto retry;
> default:
> goto err;
> --
> 1.8.0
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/6] drm/ttm: lru lock atomicity removal
2012-12-06 1:15 ` Jerome Glisse
@ 2012-12-06 1:38 ` Jerome Glisse
0 siblings, 0 replies; 15+ messages in thread
From: Jerome Glisse @ 2012-12-06 1:38 UTC (permalink / raw)
To: Maarten Lankhorst; +Cc: Thomas Hellstrom, dri-devel
On Wed, Dec 5, 2012 at 8:15 PM, Jerome Glisse <j.glisse@gmail.com> wrote:
> On Fri, Nov 30, 2012 at 7:11 AM, Maarten Lankhorst
> <maarten.lankhorst@canonical.com> wrote:
>> With all the previous patches there shouldn't be anything lying on the reservations being atomic
>> with removal of the bo's from the lru lists any more.
>>
>> As such we can finally fix the locking primitives and make it act like normal mutex calls.
>>
>> Patch 1 is the actual removal of the guarantee in ttm_bo_reserve
>> patch 2 is a cleanup of ttm_eu_backoff_reservation from removing that guarantee
>> patch 3...6 are introducing ttm_bo_reserve_slowpath.
>>
>> After this series, this should directly map to my proposed extensions to mutex.
>
> For the series :
> Reviewed-by: Jerome Glisse <jglisse@redhat.com>
>
> Few comments in some patches but nothing important.
Actually while about to send my comments i thought of some bad
scenario see patch 4.
Cheers,
Jerome
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve
2012-12-06 1:19 ` [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve Jerome Glisse
@ 2012-12-06 9:55 ` Maarten Lankhorst
0 siblings, 0 replies; 15+ messages in thread
From: Maarten Lankhorst @ 2012-12-06 9:55 UTC (permalink / raw)
To: Jerome Glisse; +Cc: Maarten Lankhorst, thellstrom, dri-devel
Op 06-12-12 02:19, Jerome Glisse schreef:
> On Fri, Nov 30, 2012 at 7:12 AM, Maarten Lankhorst
> <m.b.lankhorst@gmail.com> wrote:
>> There should no longer be assumptions that reserve will always succeed
>> with the lru lock held, so we can safely break the whole atomic
>> reserve/lru thing. As a bonus this fixes most lockdep annotations for
>> reservations.
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>> ---
>> drivers/gpu/drm/ttm/ttm_bo.c | 54 ++++++++++++++++++++++------------
>> drivers/gpu/drm/ttm/ttm_execbuf_util.c | 2 +-
>> include/drm/ttm/ttm_bo_driver.h | 19 +++---------
>> 3 files changed, 40 insertions(+), 35 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index 9028327..61b5cd0 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -213,14 +213,13 @@ int ttm_bo_del_from_lru(struct ttm_buffer_object *bo)
>> return put_count;
>> }
>>
>> -int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
>> +int ttm_bo_reserve_nolru(struct ttm_buffer_object *bo,
>> bool interruptible,
>> bool no_wait, bool use_sequence, uint32_t sequence)
>> {
>> - struct ttm_bo_global *glob = bo->glob;
>> int ret;
>>
>> - while (unlikely(atomic_read(&bo->reserved) != 0)) {
>> + while (unlikely(atomic_xchg(&bo->reserved, 1) != 0)) {
>> /**
>> * Deadlock avoidance for multi-bo reserving.
>> */
> Regarding memory barrier discussion we could add a smp_rdmb here or
> few line below before the read of sequence for -EAGAIN. But it's not
> really important, if a CPU doesn't see new sequence value the worst
> case is that the CPU will go to wait again before reaching back this
> section and returning EAGAIN. So it's just gonna waste some CPU cycle
> i can't see anything bad that could happen.
Ah indeed, no bad thing can happen from what I can tell.
I looked at Documentation/atomic_ops.txt again, and it says:
"If a caller requires memory barrier semantics around an atomic_t
operation which does not return a value, opsa set of interfaces are
defined which accomplish this:"
Since we check the value of atomic_xchg, I think that means memory barriers are implied,
strengthened by the fact that the arch specific xchg implementations I checked (x86 and sparc) clobber
memory, which definitely implies a compiler barrier at least, that should be good enough here. Parisc
has no native xchg op and falls back to using spin_lock_irqrestore, so it would definitely be good enough
there.
>> @@ -241,26 +240,36 @@ int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
>> if (no_wait)
>> return -EBUSY;
>>
>> - spin_unlock(&glob->lru_lock);
>> ret = ttm_bo_wait_unreserved(bo, interruptible);
>> - spin_lock(&glob->lru_lock);
>>
>> if (unlikely(ret))
>> return ret;
>> }
>>
>> - atomic_set(&bo->reserved, 1);
>> if (use_sequence) {
>> + bool wake_up = false;
>> /**
>> * Wake up waiters that may need to recheck for deadlock,
>> * if we decreased the sequence number.
>> */
>> if (unlikely((bo->val_seq - sequence < (1 << 31))
>> || !bo->seq_valid))
>> - wake_up_all(&bo->event_queue);
>> + wake_up = true;
>>
>> + /*
>> + * In the worst case with memory ordering these values can be
>> + * seen in the wrong order. However since we call wake_up_all
>> + * in that case, this will hopefully not pose a problem,
>> + * and the worst case would only cause someone to accidentally
>> + * hit -EAGAIN in ttm_bo_reserve when they see old value of
>> + * val_seq. However this would only happen if seq_valid was
>> + * written before val_seq was, and just means some slightly
>> + * increased cpu usage
>> + */
>> bo->val_seq = sequence;
>> bo->seq_valid = true;
> If we want we could add smp_wrmb here but see above comment on
> usefullness of this.
Yeah would be overkill. The wake_up_all takes an irqoff spinlock, which would implicitly
count as full memory barrier, and the lru lock will always be taken afterwards, which is
definitely a full memory barrier.
~Maarten
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/6] drm/ttm: use ttm_bo_reserve_slowpath_nolru in ttm_eu_reserve_buffers
2012-12-06 1:36 ` Jerome Glisse
@ 2012-12-06 10:52 ` Maarten Lankhorst
2012-12-06 15:38 ` Jerome Glisse
0 siblings, 1 reply; 15+ messages in thread
From: Maarten Lankhorst @ 2012-12-06 10:52 UTC (permalink / raw)
To: Jerome Glisse; +Cc: thellstrom, dri-devel
Op 06-12-12 02:36, Jerome Glisse schreef:
> On Fri, Nov 30, 2012 at 7:12 AM, Maarten Lankhorst
> <m.b.lankhorst@gmail.com> wrote:
>> This requires re-use of the seqno, which increases fairness slightly.
>> Instead of spinning with a new seqno every time we keep the current one,
>> but still drop all other reservations we hold. Only when we succeed,
>> we try to get back our other reservations again.
>>
>> This should increase fairness slightly as well.
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>> ---
>> drivers/gpu/drm/ttm/ttm_execbuf_util.c | 14 ++++++++++++--
>> 1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>> index c7d3236..c02b2b6 100644
>> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>> @@ -129,13 +129,17 @@ int ttm_eu_reserve_buffers(struct list_head *list)
>> entry = list_first_entry(list, struct ttm_validate_buffer, head);
>> glob = entry->bo->glob;
>>
>> -retry:
>> spin_lock(&glob->lru_lock);
>> val_seq = entry->bo->bdev->val_seq++;
>>
>> +retry:
> After more thinking while i was going over patches to send comment i
> think this one is bad, really bad. So here is bad case i see, we have
> a list of 2 bo A & B. We succeed reserving A we set seq_valid to true
> and val_seq to our val_seq. We fail on B, we go slow path and
> unreserve A. Some other CPU reserve A but before it write either
> seq_valid or val_seq we wakeup reserve B and retry to reserve A in
> reserve_nolru we see A as reserved we enter the while loop see
> seq_valid set and test val_seq against our same old val_seq we got ==
> and return -EDEADLCK which is definitly not what we want to do. If we
> inc val seq on retry we will never have this case. So what exactly not
> incrementing it gave us ?
Hm, that would indeed be a valid problem.
Annoyingly, I can't do the conversion to mutexes until the end of the patchset.
I solved this in the ticket mutexes by making the seqno an atomic, which is unset before unlocking.
But doing this with the current ttm code would force extra wake_up_all's inside the reserve path,
since we lack the slowpath handling mutexes have.
Nouveau would run into the same problems already with patch 1, so perhaps we could disable
the -EDEADLK handling and sleep in that case for now? Afaict, it's a driver error if -EDEADLK
occurs.
If performance is unimportant unset seq_valid before unreserving,
and add a smp_wmb between seqno and seqno_valid assignments.
wake_up_all() would be called unconditionally during reservation then, but that call is expensive..
> I think the read and write memory barrier i was talking in patch 1
> should avoid any such things to happen but i need to sleep on that to
> make nightmare about all things that can go wrong.
No it wouldn't help, still a race immediately after the xchg call. :-(
~Maarten
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 4/6] drm/ttm: use ttm_bo_reserve_slowpath_nolru in ttm_eu_reserve_buffers
2012-12-06 10:52 ` Maarten Lankhorst
@ 2012-12-06 15:38 ` Jerome Glisse
0 siblings, 0 replies; 15+ messages in thread
From: Jerome Glisse @ 2012-12-06 15:38 UTC (permalink / raw)
To: Maarten Lankhorst; +Cc: thellstrom, dri-devel
On Thu, Dec 6, 2012 at 5:52 AM, Maarten Lankhorst
<maarten.lankhorst@canonical.com> wrote:
> Op 06-12-12 02:36, Jerome Glisse schreef:
>> On Fri, Nov 30, 2012 at 7:12 AM, Maarten Lankhorst
>> <m.b.lankhorst@gmail.com> wrote:
>>> This requires re-use of the seqno, which increases fairness slightly.
>>> Instead of spinning with a new seqno every time we keep the current one,
>>> but still drop all other reservations we hold. Only when we succeed,
>>> we try to get back our other reservations again.
>>>
>>> This should increase fairness slightly as well.
>>>
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>> ---
>>> drivers/gpu/drm/ttm/ttm_execbuf_util.c | 14 ++++++++++++--
>>> 1 file changed, 12 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>>> index c7d3236..c02b2b6 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
>>> @@ -129,13 +129,17 @@ int ttm_eu_reserve_buffers(struct list_head *list)
>>> entry = list_first_entry(list, struct ttm_validate_buffer, head);
>>> glob = entry->bo->glob;
>>>
>>> -retry:
>>> spin_lock(&glob->lru_lock);
>>> val_seq = entry->bo->bdev->val_seq++;
>>>
>>> +retry:
>> After more thinking while i was going over patches to send comment i
>> think this one is bad, really bad. So here is bad case i see, we have
>> a list of 2 bo A & B. We succeed reserving A we set seq_valid to true
>> and val_seq to our val_seq. We fail on B, we go slow path and
>> unreserve A. Some other CPU reserve A but before it write either
>> seq_valid or val_seq we wakeup reserve B and retry to reserve A in
>> reserve_nolru we see A as reserved we enter the while loop see
>> seq_valid set and test val_seq against our same old val_seq we got ==
>> and return -EDEADLCK which is definitly not what we want to do. If we
>> inc val seq on retry we will never have this case. So what exactly not
>> incrementing it gave us ?
> Hm, that would indeed be a valid problem.
>
> Annoyingly, I can't do the conversion to mutexes until the end of the patchset.
> I solved this in the ticket mutexes by making the seqno an atomic, which is unset before unlocking.
> But doing this with the current ttm code would force extra wake_up_all's inside the reserve path,
> since we lack the slowpath handling mutexes have.
>
> Nouveau would run into the same problems already with patch 1, so perhaps we could disable
> the -EDEADLK handling and sleep in that case for now? Afaict, it's a driver error if -EDEADLK
> occurs.
>
> If performance is unimportant unset seq_valid before unreserving,
> and add a smp_wmb between seqno and seqno_valid assignments.
> wake_up_all() would be called unconditionally during reservation then, but that call is expensive..
>
>> I think the read and write memory barrier i was talking in patch 1
>> should avoid any such things to happen but i need to sleep on that to
>> make nightmare about all things that can go wrong.
> No it wouldn't help, still a race immediately after the xchg call. :-(
>
> ~Maarten
Then just keep the retry label where it's, it doesn't change the
fundamental of that patch and it avoids the EDEADLK scenario.
Cheers,
Jerome
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2012-12-06 15:38 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-11-30 12:11 [PATCH 0/6] drm/ttm: lru lock atomicity removal Maarten Lankhorst
2012-11-30 12:12 ` [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve Maarten Lankhorst
2012-11-30 12:12 ` [PATCH 2/6] drm/ttm: cleanup ttm_eu_reserve_buffers handling Maarten Lankhorst
2012-11-30 12:12 ` [PATCH 3/6] drm/ttm: add ttm_bo_reserve_slowpath Maarten Lankhorst
2012-11-30 12:12 ` [PATCH 4/6] drm/ttm: use ttm_bo_reserve_slowpath_nolru in ttm_eu_reserve_buffers Maarten Lankhorst
2012-12-06 1:36 ` Jerome Glisse
2012-12-06 10:52 ` Maarten Lankhorst
2012-12-06 15:38 ` Jerome Glisse
2012-11-30 12:12 ` [PATCH 5/6] drm/nouveau: use ttm_bo_reserve_slowpath in validate_init Maarten Lankhorst
2012-11-30 12:13 ` [PATCH 6/6] drm/ttm: unexport ttm_bo_wait_unreserved Maarten Lankhorst
2012-12-06 1:19 ` [PATCH 1/6] drm/ttm: remove lru_lock around ttm_bo_reserve Jerome Glisse
2012-12-06 9:55 ` Maarten Lankhorst
2012-11-30 15:59 ` [PATCH 0/6] drm/ttm: lru lock atomicity removal Thomas Hellstrom
2012-12-06 1:15 ` Jerome Glisse
2012-12-06 1:38 ` Jerome Glisse
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.