public inbox for linux-kernel@vger.kernel.org
* [PATCH] make idr_remove_all() do removal -before- free_layer()
From: Paul E. McKenney @ 2009-03-07 22:25 UTC
  To: linux-kernel
  Cc: manfred, Nadia.Derbey, miltonm, mingo, akpm, peterz, lnxninja,
	efault, riel

The following patch fixes a problem in the IDR system, where an
idr_remove_all() hands a data element to call_rcu() (via free_layer())
before making that data element inaccessible to new readers.  This is
very bad, and results in readers still having a reference to this data
element at the end of the grace period.  Tests on large machines that
concurrently map and unmap user-space memory within the same multithreaded
process result in crashes within about five minutes.  Applying this
patch increases the kernel's longevity to the three-to-eight-hour range.
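
The ordering rule at stake can be illustrated outside the kernel.  What
follows is a minimal single-threaded userspace sketch, not the lib/idr.c
code: struct layer, struct tree, defer_free(), reader_lookup() and
remove_all() are hypothetical names invented for this illustration, and
C11 atomics stand in for rcu_assign_pointer() and rcu_dereference().  It
only checks whether a reader starting at the moment the element is
queued for freeing could still find it through the published pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one idr layer; not the kernel structure. */
struct layer {
	bool queued_for_free;	/* set once handed to the deferred-free path */
};

/* Hypothetical stand-in for struct idr: only the published root pointer. */
struct tree {
	_Atomic(struct layer *) top;
};

/* Models free_layer(): in the kernel this is where call_rcu() is invoked. */
static void defer_free(struct layer *p)
{
	if (p)
		p->queued_for_free = true;
}

/* Models a new reader fetching the currently published root. */
static struct layer *reader_lookup(struct tree *t)
{
	return atomic_load_explicit(&t->top, memory_order_acquire);
}

/*
 * Tears down the tree in one of two orders and reports whether a reader
 * that starts right after the deferred free was queued could still reach
 * the element.  unpublish_first == true models the patched ordering;
 * false models the ordering before the patch.
 */
static bool remove_all(struct tree *t, bool unpublish_first)
{
	struct layer *p = atomic_load_explicit(&t->top, memory_order_relaxed);
	bool reachable_when_queued;

	if (unpublish_first)
		atomic_store_explicit(&t->top, (struct layer *)NULL,
				      memory_order_release);

	defer_free(p);
	reachable_when_queued = (p != NULL && reader_lookup(t) == p);

	if (!unpublish_first)
		atomic_store_explicit(&t->top, (struct layer *)NULL,
				      memory_order_release);

	return reachable_when_queued;
}
```

In this model, remove_all(t, false) reports that the element is still
reachable when it is queued, so a reader that begins after that point is
not covered by the grace period.  With remove_all(t, true) the pointer is
already NULL by then, which is the property the patch below restores for
idp->top.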

There appear to be other similar problems in idr_get_empty_slot() and
sub_remove(), but I fixed the easy one in idr_remove_all() first.  It is
therefore no surprise that failures still occur.

(Yes, and I did look at the relevant patch last year without spotting
this one.  Goes to show the value of testing as well as code review,
I guess...)

Nadia, Manfred, any thoughts?

Located-by: Milton Miller II <miltonm@austin.ibm.com>
Tested-by: Milton Miller II <miltonm@austin.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

diff --git a/lib/idr.c b/lib/idr.c
index c11c576..dab4bca 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -449,6 +449,7 @@ void idr_remove_all(struct idr *idp)
 
 	n = idp->layers * IDR_BITS;
 	p = idp->top;
+	rcu_assign_pointer(idp->top, NULL);
 	max = 1 << n;
 
 	id = 0;
@@ -467,7 +468,6 @@ void idr_remove_all(struct idr *idp)
 			p = *--paa;
 		}
 	}
-	rcu_assign_pointer(idp->top, NULL);
 	idp->layers = 0;
 }
 EXPORT_SYMBOL(idr_remove_all);

----- End forwarded message -----


* Re: [PATCH] make idr_remove_all() do removal -before- free_layer()
From: Ingo Molnar @ 2009-03-08 15:33 UTC
  To: Paul E. McKenney
  Cc: linux-kernel, manfred, Nadia.Derbey, miltonm, akpm, peterz,
	lnxninja, efault, riel


* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> The following patch fixes a problem in the IDR system, where 
> an idr_remove_all() hands a data element to call_rcu() (via 
> free_layer()) before making that data element inaccessible to 
> new readers.  This is very bad, and results in readers still 
> having a reference to this data element at the end of the 
> grace period.  Tests on large machines that concurrently map 
> and unmap user-space memory within the same multithreaded 
> process result in crashes within about five minutes.  Applying 
> this patch increases the kernel's longevity to the 
> three-to-eight-hour range.
> 
> There appear to be other similar problems in 
> idr_get_empty_slot() and sub_remove(), but I fixed the easy 
> one in idr_remove_all() first.  It is therefore no surprise 
> that failures still occur.
> 
> (Yes, and I did look at the relevant patch last year without 
> spotting this one.  Goes to show the value of testing as well 
> as code review, I guess...)
> 
> Nadia, Manfred, any thoughts?
> 
> Located-by: Milton Miller II <miltonm@austin.ibm.com>
> Tested-by: Milton Miller II <miltonm@austin.ibm.com>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Hm, looks like something we really want to see fixed in 
2.6.29-final, right?

	Ingo


* Re: [PATCH] make idr_remove_all() do removal -before- free_layer()
From: Paul E. McKenney @ 2009-03-08 19:20 UTC
  To: Ingo Molnar
  Cc: linux-kernel, manfred, Nadia.Derbey, miltonm, akpm, peterz,
	lnxninja, efault, riel

On Sun, Mar 08, 2009 at 04:33:36PM +0100, Ingo Molnar wrote:
> 
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > The following patch fixes a problem in the IDR system, where 
> > an idr_remove_all() hands a data element to call_rcu() (via 
> > free_layer()) before making that data element inaccessible to 
> > new readers.  This is very bad, and results in readers still 
> > having a reference to this data element at the end of the 
> > grace period.  Tests on large machines that concurrently map 
> > and unmap user-space memory within the same multithreaded 
> > process result in crashes within about five minutes.  Applying 
> > this patch increases the kernel's longevity to the 
> > three-to-eight-hour range.
> > 
> > There appear to be other similar problems in 
> > idr_get_empty_slot() and sub_remove(), but I fixed the easy 
> > one in idr_remove_all() first.  It is therefore no surprise 
> > that failures still occur.
> > 
> > (Yes, and I did look at the relevant patch last year without 
> > spotting this one.  Goes to show the value of testing as well 
> > as code review, I guess...)
> > 
> > Nadia, Manfred, any thoughts?
> > 
> > Located-by: Milton Miller II <miltonm@austin.ibm.com>
> > Tested-by: Milton Miller II <miltonm@austin.ibm.com>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> Hm, looks like something we really want to see fixed in 
> 2.6.29-final, right?

This was located in real testing, so I agree that it is pretty high
priority.  So this patch should go into 2.6.29.

The priority of the remaining as-yet-unknown fixes depends on their
complexity and risk.

							Thanx, Paul

