public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Fixed a mismatch between the users of radix_tree and the implementation.
@ 2010-08-13  7:15 Salman Qazi
  2010-08-13  7:20 ` Salman Qazi
  0 siblings, 1 reply; 17+ messages in thread
From: Salman Qazi @ 2010-08-13  7:15 UTC (permalink / raw)
  To: npiggin, paulmck, akpm, linux-kernel, a.p.zijlstra; +Cc: adurbin

This matters for the lockless page cache, in particular find_get_pages implementation.

In the following case, we get can get a deadlock:

    0.  The radix tree contains two items, one has the index 0.
    1.  The reader (in this case find_get_pages) takes the rcu_read_lock.
    2.  The reader acquires slot(s) for item(s) including the index 0 item.
    3.  The non-zero index item is deleted, and as a consequence the other item
        is moved to the root of the tree.  The place where it used to be is
        queued for deletion after the readers finish.
    4.  The reader looks at the index 0 slot, and finds that the page has 0 ref count
    5.  The reader looks at it again, hoping that the item will either be freed
        or the ref count will increase.  This never happens, as the slot it
        is looking at will never be updated.  Also, this slot can never be reclaimed
        because the reader is holding rcu_read_lock and is in an infinite loop.

This can be reproduced with reliably by running dbench followed by compilebench under
autotest.  I have not been able to construct a small targeted repro case.

There is also a similar potential issue with insertion.  Storing the first
element in the root and then moving it to a new slot on insertion of a
second element would potentially lead to a similar problem.

Both of these issues have been fixed in this change.

Signed-off-by: Salman Qazi <sqazi@google.com>
---
 lib/radix-tree.c |    9 ++-------
 1 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index e907858..035d5aa 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -252,11 +252,6 @@ static int radix_tree_extend(struct radix_tree_root *root, unsigned long index)
 	while (index > radix_tree_maxindex(height))
 		height++;
 
-	if (root->rnode == NULL) {
-		root->height = height;
-		goto out;
-	}
-
 	do {
 		unsigned int newheight;
 		if (!(node = radix_tree_node_alloc(root)))
@@ -278,7 +273,7 @@ static int radix_tree_extend(struct radix_tree_root *root, unsigned long index)
 		rcu_assign_pointer(root->rnode, node);
 		root->height = newheight;
 	} while (height > root->height);
-out:
+
 	return 0;
 }
 
@@ -1154,7 +1149,7 @@ EXPORT_SYMBOL(radix_tree_gang_lookup_tag_slot);
 static inline void radix_tree_shrink(struct radix_tree_root *root)
 {
 	/* try to shrink tree height */
-	while (root->height > 0) {
+	while (root->height > 1) {
 		struct radix_tree_node *to_free = root->rnode;
 		void *newptr;
 


^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2010-11-11  0:30 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-08-13  7:15 [PATCH] Fixed a mismatch between the users of radix_tree and the implementation Salman Qazi
2010-08-13  7:20 ` Salman Qazi
2010-08-16 11:25   ` Peter Zijlstra
2010-08-16 18:30     ` Salman Qazi
2010-08-16 19:33       ` Peter Zijlstra
     [not found]         ` <AANLkTindpUo5tPiXVp6wXjOAqczrJvYegrOMBqSjr4_t@mail.gmail.com>
2010-08-16 21:05           ` Salman Qazi
2010-08-16 21:06           ` Peter Zijlstra
2010-08-17  4:35             ` Salman Qazi
2010-08-17  4:45               ` Salman Qazi
2010-08-17  8:42                 ` Peter Zijlstra
2010-08-17 15:23       ` Nick Piggin
2010-08-17 15:48         ` Peter Zijlstra
2010-08-17 19:49           ` Salman Qazi
2010-08-17 21:28         ` Salman Qazi
2010-08-30 23:27           ` Salman Qazi
2010-11-10 19:16             ` Salman Qazi
2010-11-11  0:29               ` Nick Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox