All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Hellstrom <thellstrom-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
To: Thomas Hellstrom <thomas-4+hqylr40dJg9hUCZPvPmw@public.gmane.org>
Cc: "airlied-cv59FeDIM0c@public.gmane.org"
	<airlied-cv59FeDIM0c@public.gmane.org>,
	"nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org"
	<nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>,
	Luca Barbieri
	<luca-Ukmtq+NC3rhBHFWNQifrYwC/G2K4zDHf@public.gmane.org>,
	"dri-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org"
	<dri-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>
Subject: Re: [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete
Date: Wed, 20 Jan 2010 13:16:01 +0100	[thread overview]
Message-ID: <4B56F401.8070700@vmware.com> (raw)
In-Reply-To: <4B56F308.5090603-4+hqylr40dJg9hUCZPvPmw@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 510 bytes --]

Thomas Hellstrom wrote:
> Thomas Hellstrom wrote:
>   
>> Yes, it looks correct. Although it seems a little unintuitive to enter 
>> the loop with the spinlock held, and exit it with the spinlock not held. 
>> I've attached yet another patch to have that fixed. Could you take a 
>> look at whether it seems OK with you and, in that case, repost it for Dave?
>>
>>   
>>     
> ... And the patch.
>
> /Thomas
>
>   
Hmm, It seems I should've stayed in bed today. Hopefully this is the 
right one...

/Thomas



[-- Attachment #2: 0001-drm-ttm-Fix-race-condition-in-ttm_bo_delayed_delete.patch --]
[-- Type: text/x-patch, Size: 3514 bytes --]

From b3dc72cf74d1b8e0eb5f5423fbb0ac975f147f4c Mon Sep 17 00:00:00 2001
From: Luca Barbieri <luca-Ukmtq+NC3rhBHFWNQifrYwC/G2K4zDHf@public.gmane.org>
Date: Mon, 18 Jan 2010 19:34:53 +0100
Subject: [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete (v2)

ttm_bo_delayed_delete has a race condition, because after we do:
kref_put(&nentry->list_kref, ttm_bo_release_list);

we are not holding the list lock and not holding any reference to
objects, and thus every bo in the list can be removed and freed at
this point.

However, we then use the next pointer we stored, which is not guaranteed
to be valid.

This was apparently the cause of some Nouveau oopses I experienced.

This patch rewrites the function so that it keeps the reference to nentry
until nentry itself is freed and we already got a reference to nentry->next.

It should now be correct and free of races, but please double check this.

Updated according to Thomas Hellstrom's feedback.

Signed-off-by: Luca Barbieri <luca-Ukmtq+NC3rhBHFWNQifrYwC/G2K4zDHf@public.gmane.org>
Signed-off-by: Thomas Hellstrom <thellstrom-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/drm/ttm/ttm_bo.c |   58 ++++++++++++++++++------------------------
 1 files changed, 25 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index c7733c3..1a3e909 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -524,52 +524,44 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo, bool remove_all)
 static int ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
 {
 	struct ttm_bo_global *glob = bdev->glob;
-	struct ttm_buffer_object *entry, *nentry;
-	struct list_head *list, *next;
-	int ret;
+	struct ttm_buffer_object *entry = NULL;
+	int ret = 0;
 
 	spin_lock(&glob->lru_lock);
-	list_for_each_safe(list, next, &bdev->ddestroy) {
-		entry = list_entry(list, struct ttm_buffer_object, ddestroy);
-		nentry = NULL;
+	if (list_empty(&bdev->ddestroy))
+		goto out_unlock;
 
-		/*
-		 * Protect the next list entry from destruction while we
-		 * unlock the lru_lock.
-		 */
+	entry = list_first_entry(&bdev->ddestroy,
+		struct ttm_buffer_object, ddestroy);
+	kref_get(&entry->list_kref);
+
+	for (;;) {
+		struct ttm_buffer_object *nentry = NULL;
 
-		if (next != &bdev->ddestroy) {
-			nentry = list_entry(next, struct ttm_buffer_object,
-					    ddestroy);
+		if (entry->ddestroy.next != &bdev->ddestroy) {
+			nentry = list_first_entry(&entry->ddestroy,
+				struct ttm_buffer_object, ddestroy);
 			kref_get(&nentry->list_kref);
 		}
-		kref_get(&entry->list_kref);
 
 		spin_unlock(&glob->lru_lock);
 		ret = ttm_bo_cleanup_refs(entry, remove_all);
 		kref_put(&entry->list_kref, ttm_bo_release_list);
+		entry = nentry;
+
+		if (ret || !entry)
+			goto out;
 
 		spin_lock(&glob->lru_lock);
-		if (nentry) {
-			bool next_onlist = !list_empty(next);
-			spin_unlock(&glob->lru_lock);
-			kref_put(&nentry->list_kref, ttm_bo_release_list);
-			spin_lock(&glob->lru_lock);
-			/*
-			 * Someone might have raced us and removed the
-			 * next entry from the list. We don't bother restarting
-			 * list traversal.
-			 */
-
-			if (!next_onlist)
-				break;
-		}
-		if (ret)
+		if (list_empty(&entry->ddestroy))
 			break;
 	}
-	ret = !list_empty(&bdev->ddestroy);
-	spin_unlock(&glob->lru_lock);
 
+out_unlock:
+	spin_unlock(&glob->lru_lock);
+out:
+	if (entry)
+		kref_put(&entry->list_kref, ttm_bo_release_list);
 	return ret;
 }
 
-- 
1.6.2.5


[-- Attachment #3: Type: text/plain, Size: 181 bytes --]

_______________________________________________
Nouveau mailing list
Nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

WARNING: multiple messages have this Message-ID (diff)
From: Thomas Hellstrom <thellstrom-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
To: Thomas Hellstrom <thomas-4+hqylr40dJg9hUCZPvPmw@public.gmane.org>
Cc: "airlied-cv59FeDIM0c@public.gmane.org"
	<airlied-cv59FeDIM0c@public.gmane.org>,
	"nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org"
	<nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>,
	Luca Barbieri
	<luca-Ukmtq+NC3rhBHFWNQifrYwC/G2K4zDHf@public.gmane.org>,
	"dri-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org"
	<dri-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>
Subject: Re: [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete
Date: Wed, 20 Jan 2010 13:16:01 +0100	[thread overview]
Message-ID: <4B56F401.8070700@vmware.com> (raw)
In-Reply-To: <4B56F308.5090603-4+hqylr40dJg9hUCZPvPmw@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 510 bytes --]

Thomas Hellstrom wrote:
> Thomas Hellstrom wrote:
>   
>> Yes, it looks correct. Although it seems a little unintuitive to enter 
>> the loop with the spinlock held, and exit it with the spinlock not held. 
>> I've attached yet another patch to have that fixed. Could you take a 
>> look at whether it seems OK with you and, in that case, repost it for Dave?
>>
>>   
>>     
> ... And the patch.
>
> /Thomas
>
>   
Hmm, It seems I should've stayed in bed today. Hopefully this is the 
right one...

/Thomas



[-- Attachment #2: 0001-drm-ttm-Fix-race-condition-in-ttm_bo_delayed_delete.patch --]
[-- Type: text/x-patch, Size: 3515 bytes --]

>From b3dc72cf74d1b8e0eb5f5423fbb0ac975f147f4c Mon Sep 17 00:00:00 2001
From: Luca Barbieri <luca-Ukmtq+NC3rhBHFWNQifrYwC/G2K4zDHf@public.gmane.org>
Date: Mon, 18 Jan 2010 19:34:53 +0100
Subject: [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete (v2)

ttm_bo_delayed_delete has a race condition, because after we do:
kref_put(&nentry->list_kref, ttm_bo_release_list);

we are not holding the list lock and not holding any reference to
objects, and thus every bo in the list can be removed and freed at
this point.

However, we then use the next pointer we stored, which is not guaranteed
to be valid.

This was apparently the cause of some Nouveau oopses I experienced.

This patch rewrites the function so that it keeps the reference to nentry
until nentry itself is freed and we already got a reference to nentry->next.

It should now be correct and free of races, but please double check this.

Updated according to Thomas Hellstrom's feedback.

Signed-off-by: Luca Barbieri <luca-Ukmtq+NC3rhBHFWNQifrYwC/G2K4zDHf@public.gmane.org>
Signed-off-by: Thomas Hellstrom <thellstrom-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
---
 drivers/gpu/drm/ttm/ttm_bo.c |   58 ++++++++++++++++++------------------------
 1 files changed, 25 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index c7733c3..1a3e909 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -524,52 +524,44 @@ static int ttm_bo_cleanup_refs(struct ttm_buffer_object *bo, bool remove_all)
 static int ttm_bo_delayed_delete(struct ttm_bo_device *bdev, bool remove_all)
 {
 	struct ttm_bo_global *glob = bdev->glob;
-	struct ttm_buffer_object *entry, *nentry;
-	struct list_head *list, *next;
-	int ret;
+	struct ttm_buffer_object *entry = NULL;
+	int ret = 0;
 
 	spin_lock(&glob->lru_lock);
-	list_for_each_safe(list, next, &bdev->ddestroy) {
-		entry = list_entry(list, struct ttm_buffer_object, ddestroy);
-		nentry = NULL;
+	if (list_empty(&bdev->ddestroy))
+		goto out_unlock;
 
-		/*
-		 * Protect the next list entry from destruction while we
-		 * unlock the lru_lock.
-		 */
+	entry = list_first_entry(&bdev->ddestroy,
+		struct ttm_buffer_object, ddestroy);
+	kref_get(&entry->list_kref);
+
+	for (;;) {
+		struct ttm_buffer_object *nentry = NULL;
 
-		if (next != &bdev->ddestroy) {
-			nentry = list_entry(next, struct ttm_buffer_object,
-					    ddestroy);
+		if (entry->ddestroy.next != &bdev->ddestroy) {
+			nentry = list_first_entry(&entry->ddestroy,
+				struct ttm_buffer_object, ddestroy);
 			kref_get(&nentry->list_kref);
 		}
-		kref_get(&entry->list_kref);
 
 		spin_unlock(&glob->lru_lock);
 		ret = ttm_bo_cleanup_refs(entry, remove_all);
 		kref_put(&entry->list_kref, ttm_bo_release_list);
+		entry = nentry;
+
+		if (ret || !entry)
+			goto out;
 
 		spin_lock(&glob->lru_lock);
-		if (nentry) {
-			bool next_onlist = !list_empty(next);
-			spin_unlock(&glob->lru_lock);
-			kref_put(&nentry->list_kref, ttm_bo_release_list);
-			spin_lock(&glob->lru_lock);
-			/*
-			 * Someone might have raced us and removed the
-			 * next entry from the list. We don't bother restarting
-			 * list traversal.
-			 */
-
-			if (!next_onlist)
-				break;
-		}
-		if (ret)
+		if (list_empty(&entry->ddestroy))
 			break;
 	}
-	ret = !list_empty(&bdev->ddestroy);
-	spin_unlock(&glob->lru_lock);
 
+out_unlock:
+	spin_unlock(&glob->lru_lock);
+out:
+	if (entry)
+		kref_put(&entry->list_kref, ttm_bo_release_list);
 	return ret;
 }
 
-- 
1.6.2.5


[-- Attachment #3: Type: text/plain, Size: 181 bytes --]

_______________________________________________
Nouveau mailing list
Nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

  parent reply	other threads:[~2010-01-20 12:16 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-18 18:47 [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete Luca Barbieri
2010-01-18 19:40 ` Thomas Hellstrom
     [not found]   ` <4B54B949.9010906-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
2010-01-18 22:33     ` Luca Barbieri
     [not found]       ` <ff13bc9a1001181433v1694e681l9f7ce9d880132dc3-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-20 11:28         ` Thomas Hellstrom
     [not found]           ` <4B56E8EE.8090706-4+hqylr40dJg9hUCZPvPmw@public.gmane.org>
2010-01-20 12:11             ` Thomas Hellstrom
2010-01-20 12:11               ` Thomas Hellstrom
     [not found]               ` <4B56F308.5090603-4+hqylr40dJg9hUCZPvPmw@public.gmane.org>
2010-01-20 12:16                 ` Thomas Hellstrom [this message]
2010-01-20 12:16                   ` Thomas Hellstrom
     [not found]                   ` <4B56F401.8070700-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
2010-01-20 19:22                     ` Luca Barbieri
     [not found]                       ` <ff13bc9a1001201122y110fb003k704bc6d05d2aea07-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-20 20:16                         ` Thomas Hellstrom
     [not found]                           ` <4B5764BA.7080801-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
2010-01-20 20:45                             ` Luca Barbieri
     [not found]                               ` <ff13bc9a1001201245g6ee25219q851b7989968f4c7b-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-20 20:58                                 ` Luca Barbieri
     [not found]                                   ` <ff13bc9a1001201258i37ee7354gb305aa98afae3716-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-20 21:11                                     ` Thomas Hellstrom
2010-01-20 21:04                                 ` Thomas Hellstrom
     [not found]                                   ` <4B576FF5.9040907-4+hqylr40dJg9hUCZPvPmw@public.gmane.org>
2010-01-21  3:49                                     ` Luca Barbieri
     [not found]                                       ` <ff13bc9a1001201949l4691f202v2a2874b9cef86f37-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-21 12:29                                         ` Jerome Glisse
     [not found]                                           ` <20100121122920.GB3837-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-01-21 12:59                                             ` Thomas Hellstrom
     [not found]                                               ` <4B584FAE.8040801-4+hqylr40dJg9hUCZPvPmw@public.gmane.org>
2010-01-25  8:14                                                 ` Jerome Glisse
     [not found]                                                   ` <20100125081444.GA23124-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2010-01-25 20:51                                                     ` Thomas Hellstrom
2010-01-21 15:14                                             ` Luca Barbieri
     [not found]                                               ` <ff13bc9a1001210714m34c3976etc7680f056cb55453-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-25  8:20                                                 ` Jerome Glisse
2010-01-21 12:53                                         ` Thomas Hellstrom
     [not found]                                           ` <4B584E48.8020806-4+hqylr40dJg9hUCZPvPmw@public.gmane.org>
2010-01-21 13:40                                             ` Luca Barbieri
     [not found]                                               ` <ff13bc9a1001210540t7dc2b7a7p38359ca82b8b3eb4-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-21 14:07                                                 ` Francisco Jerez
     [not found]                                                   ` <87vdeveekk.fsf-sGOZH3hwPm2sTnJN9+BGXg@public.gmane.org>
2010-01-21 14:17                                                     ` Luca Barbieri
     [not found]                                                       ` <ff13bc9a1001210617qe37ab7at949d545216693608-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-21 14:44                                                         ` Francisco Jerez
     [not found]                                                           ` <87pr53ecwd.fsf-sGOZH3hwPm2sTnJN9+BGXg@public.gmane.org>
2010-01-21 15:36                                                             ` Luca Barbieri
     [not found]                                                               ` <ff13bc9a1001210736k71740417r7b129c70374fece3-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-21 16:18                                                                 ` Francisco Jerez
     [not found]                                                                   ` <87y6jrbfef.fsf-sGOZH3hwPm2sTnJN9+BGXg@public.gmane.org>
2010-01-21 16:30                                                                     ` Luca Barbieri
     [not found]                                                                       ` <ff13bc9a1001210830g2653f672r918f8cb90cbf6170-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-21 17:39                                                                         ` Francisco Jerez
2010-01-21 15:39                                                             ` Maarten Maathuis
     [not found]                                                               ` <6d4bc9fc1001210739r2bb8b4c4i18590376a1628d82-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-21 15:56                                                                 ` Luca Barbieri
     [not found]                                                                   ` <ff13bc9a1001210756r1e627146w89fb2138ca77e6b5-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-21 16:02                                                                     ` Maarten Maathuis
2010-01-21 14:23                                                 ` Thomas Hellstrom

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4B56F401.8070700@vmware.com \
    --to=thellstrom-pghwnbhtmq7qt0dzr+alfa@public.gmane.org \
    --cc=airlied-cv59FeDIM0c@public.gmane.org \
    --cc=dri-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org \
    --cc=luca-Ukmtq+NC3rhBHFWNQifrYwC/G2K4zDHf@public.gmane.org \
    --cc=nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org \
    --cc=thomas-4+hqylr40dJg9hUCZPvPmw@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.