[RFC][PATCH] mm: couple rcu and memory reclaim

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [RFC][PATCH] mm: couple rcu and memory reclaim
@ 2007-09-24  8:45 Peter Zijlstra
  2007-09-24 10:42 ` Balbir Singh
  2007-09-28 20:13 ` Nick Piggin
  0 siblings, 2 replies; 8+ messages in thread
From: Peter Zijlstra @ 2007-09-24  8:45 UTC (permalink / raw)
  To: Paul E. McKenney, linux-kernel; +Cc: Nick Piggin, Rik van Riel

Just an idea I had, it seems like a good idea to wait for RCU callbacks
in reclaim so that we won't get all of memory stuck there.

If this location is too aggressive we might stick it next to
disable_swap_token().

---
Couple RCU and reclaim.

There could be a lot of memory stuck in RCU callbacks. Wait for RCU to
finish before giving it another go.

Placed in kswapd and not direct reclaim path because kswapd never holds
rcu_read_lock() at this point and can thus not deadlock. Direct reclaim
callers might hold rcu_read_lock() and would suffer from deadlocks if
sync_rcu() were to be called.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 mm/vmscan.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -1435,8 +1435,10 @@ loop_again:
 		unsigned long lru_pages = 0;

 		/* The swap token gets in the way of swapout... */
-		if (!priority)
+		if (!priority) {
+			synchronize_rcu();
 			disable_swap_token();
+		}

 		all_zones_ok = 1;

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] mm: couple rcu and memory reclaim
  2007-09-24  8:45 [RFC][PATCH] mm: couple rcu and memory reclaim Peter Zijlstra
@ 2007-09-24 10:42 ` Balbir Singh
  2007-09-24 11:06   ` Peter Zijlstra
  2007-09-28 20:13 ` Nick Piggin
  1 sibling, 1 reply; 8+ messages in thread
From: Balbir Singh @ 2007-09-24 10:42 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Paul E. McKenney, linux-kernel, Nick Piggin, Rik van Riel

Peter Zijlstra wrote:
> 
> Just an idea I had, it seems like a good idea to wait for RCU callbacks
> in reclaim so that we won't get all of memory stuck there.
> 
> If this location is too aggressive we might stick it next to
> disable_swap_token().
> 
> ---
> Couple RCU and reclaim.
> 
> There could be a lot of memory stuck in RCU callbacks. Wait for RCU to
> finish before giving it another go.
> 
> Placed in kswapd and not direct reclaim path because kswapd never holds
> rcu_read_lock() at this point and can thus not deadlock. Direct reclaim
> callers might hold rcu_read_lock() and would suffer from deadlocks if
> sync_rcu() were to be called.
> 
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
>  mm/vmscan.c |    4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> Index: linux-2.6/mm/vmscan.c
> ===================================================================
> --- linux-2.6.orig/mm/vmscan.c
> +++ linux-2.6/mm/vmscan.c
> @@ -1435,8 +1435,10 @@ loop_again:
>  		unsigned long lru_pages = 0;
> 
>  		/* The swap token gets in the way of swapout... */
> -		if (!priority)
> +		if (!priority) {
> +			synchronize_rcu();

Interesting change

1. I suspect that synchronize_rcu() is most likely to free up
   slab pages, so shrink_slab() will clean up all the freed
   pages. Could we add a comment to indicate the same?
2. Shouldn't we do this in balance_pgdat() as well?


>  			disable_swap_token();
> +		}
> 
>  		all_zones_ok = 1;
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] mm: couple rcu and memory reclaim
  2007-09-24 10:42 ` Balbir Singh
@ 2007-09-24 11:06   ` Peter Zijlstra
  2007-09-24 11:22     ` Balbir Singh
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2007-09-24 11:06 UTC (permalink / raw)
  To: balbir; +Cc: Paul E. McKenney, linux-kernel, Nick Piggin, Rik van Riel

On Mon, 24 Sep 2007 16:12:19 +0530 Balbir Singh
<balbir@linux.vnet.ibm.com> wrote:

> Peter Zijlstra wrote:
> > 
> > Just an idea I had, it seems like a good idea to wait for RCU callbacks
> > in reclaim so that we won't get all of memory stuck there.
> > 
> > If this location is too aggressive we might stick it next to
> > disable_swap_token().
> > 
> > ---
> > Couple RCU and reclaim.
> > 
> > There could be a lot of memory stuck in RCU callbacks. Wait for RCU to
> > finish before giving it another go.
> > 
> > Placed in kswapd and not direct reclaim path because kswapd never holds
> > rcu_read_lock() at this point and can thus not deadlock. Direct reclaim
> > callers might hold rcu_read_lock() and would suffer from deadlocks if
> > sync_rcu() were to be called.
> > 
> > Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > ---
> >  mm/vmscan.c |    4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > Index: linux-2.6/mm/vmscan.c
> > ===================================================================
> > --- linux-2.6.orig/mm/vmscan.c
> > +++ linux-2.6/mm/vmscan.c
> > @@ -1435,8 +1435,10 @@ loop_again:
> >  		unsigned long lru_pages = 0;
> > 
> >  		/* The swap token gets in the way of swapout... */
> > -		if (!priority)
> > +		if (!priority) {
> > +			synchronize_rcu();

Bah, it seems I send the wrong patch out :-/

this is the one against disable_swap_token(). I meant to send out this
one:

@@ -1527,8 +1527,10 @@ loop_again:
                 * OK, kswapd is getting into trouble.  Take a nap, then take
                 * another pass across the zones.
                 */
-               if (total_scanned && priority < DEF_PRIORITY - 2)
+               if (total_scanned && priority < DEF_PRIORITY - 2) {
+                       synchronize_rcu();
                        congestion_wait(WRITE, HZ/10);
+               }

 
> Interesting change
> 
> 1. I suspect that synchronize_rcu() is most likely to free up
>    slab pages, so shrink_slab() will clean up all the freed
>    pages. Could we add a comment to indicate the same?

Yes indeed, will add such a comment.

> 2. Shouldn't we do this in balance_pgdat() as well?

Uhm, this is balance_pgdat() (both these changes) :-)

Only kswapd can do this, direct reclaim has deadlock potential.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] mm: couple rcu and memory reclaim
  2007-09-24 11:06   ` Peter Zijlstra
@ 2007-09-24 11:22     ` Balbir Singh
  2007-09-24 11:50       ` Peter Zijlstra
  0 siblings, 1 reply; 8+ messages in thread
From: Balbir Singh @ 2007-09-24 11:22 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Paul E. McKenney, linux-kernel, Nick Piggin, Rik van Riel

Peter Zijlstra wrote:
> On Mon, 24 Sep 2007 16:12:19 +0530 Balbir Singh
> <balbir@linux.vnet.ibm.com> wrote:
> 
>> Peter Zijlstra wrote:
>>> Just an idea I had, it seems like a good idea to wait for RCU callbacks
>>> in reclaim so that we won't get all of memory stuck there.
>>>
>>> If this location is too aggressive we might stick it next to
>>> disable_swap_token().
>>>
>>> ---
>>> Couple RCU and reclaim.
>>>
>>> There could be a lot of memory stuck in RCU callbacks. Wait for RCU to
>>> finish before giving it another go.
>>>
>>> Placed in kswapd and not direct reclaim path because kswapd never holds
>>> rcu_read_lock() at this point and can thus not deadlock. Direct reclaim
>>> callers might hold rcu_read_lock() and would suffer from deadlocks if
>>> sync_rcu() were to be called.
>>>
>>> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
>>> ---
>>>  mm/vmscan.c |    4 +++-
>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> Index: linux-2.6/mm/vmscan.c
>>> ===================================================================
>>> --- linux-2.6.orig/mm/vmscan.c
>>> +++ linux-2.6/mm/vmscan.c
>>> @@ -1435,8 +1435,10 @@ loop_again:
>>>  		unsigned long lru_pages = 0;
>>>
>>>  		/* The swap token gets in the way of swapout... */
>>> -		if (!priority)
>>> +		if (!priority) {
>>> +			synchronize_rcu();
> 
> Bah, it seems I send the wrong patch out :-/
> 

Looks like I reviewed the wrong thing then :-)

> this is the one against disable_swap_token(). I meant to send out this
> one:
> 
> @@ -1527,8 +1527,10 @@ loop_again:
>                  * OK, kswapd is getting into trouble.  Take a nap, then take
>                  * another pass across the zones.
>                  */
> -               if (total_scanned && priority < DEF_PRIORITY - 2)
> +               if (total_scanned && priority < DEF_PRIORITY - 2) {
> +                       synchronize_rcu();
>                         congestion_wait(WRITE, HZ/10);
> +               }
> 
> 
>> Interesting change
>>
>> 1. I suspect that synchronize_rcu() is most likely to free up
>>    slab pages, so shrink_slab() will clean up all the freed
>>    pages. Could we add a comment to indicate the same?
> 
> Yes indeed, will add such a comment.
> 
>> 2. Shouldn't we do this in balance_pgdat() as well?
> 
> Uhm, this is balance_pgdat() (both these changes) :-)
> 

Hmm.. Could you please generate the diff with -p.

> Only kswapd can do this, direct reclaim has deadlock potential.

Yes, but not in all cases, do you want to add any gfp_mask
based smartness for direct reclaim?


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] mm: couple rcu and memory reclaim
  2007-09-24 11:22     ` Balbir Singh
@ 2007-09-24 11:50       ` Peter Zijlstra
  2007-09-24 12:48         ` Balbir Singh
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2007-09-24 11:50 UTC (permalink / raw)
  To: balbir; +Cc: Paul E. McKenney, linux-kernel, Nick Piggin, Rik van Riel

On Mon, 24 Sep 2007 16:52:15 +0530 Balbir Singh
<balbir@linux.vnet.ibm.com> wrote:

> Peter Zijlstra wrote:
> > On Mon, 24 Sep 2007 16:12:19 +0530 Balbir Singh
> > <balbir@linux.vnet.ibm.com> wrote:
> > 
> >> Peter Zijlstra wrote:
> >>> Just an idea I had, it seems like a good idea to wait for RCU callbacks
> >>> in reclaim so that we won't get all of memory stuck there.
> >>>
> >>> If this location is too aggressive we might stick it next to
> >>> disable_swap_token().
> >>>
> >>> ---
> >>> Couple RCU and reclaim.
> >>>
> >>> There could be a lot of memory stuck in RCU callbacks. Wait for RCU to
> >>> finish before giving it another go.
> >>>
> >>> Placed in kswapd and not direct reclaim path because kswapd never holds
> >>> rcu_read_lock() at this point and can thus not deadlock. Direct reclaim
> >>> callers might hold rcu_read_lock() and would suffer from deadlocks if
> >>> sync_rcu() were to be called.
> >>>
> >>> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> >>> ---

---
 mm/vmscan.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -1527,8 +1527,10 @@ loop_again:
 		 * OK, kswapd is getting into trouble.  Take a nap, then take
 		 * another pass across the zones.
 		 */
-		if (total_scanned && priority < DEF_PRIORITY - 2)
+		if (total_scanned && priority < DEF_PRIORITY - 2) {
+			synchronize_rcu();
 			congestion_wait(WRITE, HZ/10);
+		}
 
 		/*
 		 * We do this so kswapd doesn't build up large priorities for

> > Only kswapd can do this, direct reclaim has deadlock potential.
> 
> Yes, but not in all cases, do you want to add any gfp_mask
> based smartness for direct reclaim?

gfp_mask doesn't carry the needed information. It depends on whether
the current context holds a rcu_read_lock().

so something like:

   rcu_read_lock()
   foo = kmalloc(sizeof(foo))
      new_slab()
        __alloc_pages()
          try_to_free_pages()
            synchronise_rcu() <-- deadlock
   rcu_read_unlock()

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] mm: couple rcu and memory reclaim
  2007-09-24 11:50       ` Peter Zijlstra
@ 2007-09-24 12:48         ` Balbir Singh
  2007-09-24 12:58           ` Peter Zijlstra
  0 siblings, 1 reply; 8+ messages in thread
From: Balbir Singh @ 2007-09-24 12:48 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Paul E. McKenney, linux-kernel, Nick Piggin, Rik van Riel

Peter Zijlstra wrote:
>>> Only kswapd can do this, direct reclaim has deadlock potential.
>> Yes, but not in all cases, do you want to add any gfp_mask
>> based smartness for direct reclaim?
> 
> gfp_mask doesn't carry the needed information. It depends on whether
> the current context holds a rcu_read_lock().
> 

What I meant was that nobody would hold rcu_read_lock() and pass
gfp_mask of GFP_KERNEL in scan_control or to do_try_to_free_pages()

> so something like:
> 
>    rcu_read_lock()
>    foo = kmalloc(sizeof(foo))

At this point, you really can't use GFP_KERNEL, since rcu_read_lock()
disables pre-emption in the current kernel, ideally you should see
a might_sleep() BUG.

>       new_slab()
>         __alloc_pages()
>           try_to_free_pages()
>             synchronise_rcu() <-- deadlock
>    rcu_read_unlock()


-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] mm: couple rcu and memory reclaim
  2007-09-24 12:48         ` Balbir Singh
@ 2007-09-24 12:58           ` Peter Zijlstra
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Zijlstra @ 2007-09-24 12:58 UTC (permalink / raw)
  To: balbir; +Cc: Paul E. McKenney, linux-kernel, Nick Piggin, Rik van Riel

On Mon, 24 Sep 2007 18:18:30 +0530 Balbir Singh
<balbir@linux.vnet.ibm.com> wrote:

> Peter Zijlstra wrote:
> >>> Only kswapd can do this, direct reclaim has deadlock potential.
> >> Yes, but not in all cases, do you want to add any gfp_mask
> >> based smartness for direct reclaim?
> > 
> > gfp_mask doesn't carry the needed information. It depends on whether
> > the current context holds a rcu_read_lock().
> > 
> 
> What I meant was that nobody would hold rcu_read_lock() and pass
> gfp_mask of GFP_KERNEL in scan_control or to do_try_to_free_pages()
> 
> > so something like:
> > 
> >    rcu_read_lock()
> >    foo = kmalloc(sizeof(foo))
> 
> At this point, you really can't use GFP_KERNEL, since rcu_read_lock()
> disables pre-emption in the current kernel, ideally you should see
> a might_sleep() BUG.
> 
> >       new_slab()
> >         __alloc_pages()
> >           try_to_free_pages()
> >             synchronise_rcu() <-- deadlock
> >    rcu_read_unlock()

I guess I've been using preemptibe rcu for way too long. Would need to
clean up some of -rt to make this true there, but should be doable
indeed.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC][PATCH] mm: couple rcu and memory reclaim
  2007-09-24  8:45 [RFC][PATCH] mm: couple rcu and memory reclaim Peter Zijlstra
  2007-09-24 10:42 ` Balbir Singh
@ 2007-09-28 20:13 ` Nick Piggin
  1 sibling, 0 replies; 8+ messages in thread
From: Nick Piggin @ 2007-09-28 20:13 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Paul E. McKenney, linux-kernel, Rik van Riel

On Monday 24 September 2007 18:45, Peter Zijlstra wrote:
> Just an idea I had, it seems like a good idea to wait for RCU callbacks
> in reclaim so that we won't get all of memory stuck there.

I think it would be much too aggressive (_especially_ with preemptible
RCU, I would have thought?). And not just for the case of adding a lot
of idle time while other CPUs are busy not being preempted -- also
that if you have lots of CPUs reclaiming and asking to synchronize_rcu,
then you could have global RCU state getting ping-ponged.

> If this location is too aggressive we might stick it next to
> disable_swap_token().

I don't see a really good reason to add this kind of wait here. I mean,
we have to wait for RCU anyway, so we may as well scan what we can
until then?

Putting it in a slowpath at some point when we think we're heading for
OOM I think would be a bit better.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2007-09-29 12:45 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-09-24  8:45 [RFC][PATCH] mm: couple rcu and memory reclaim Peter Zijlstra
2007-09-24 10:42 ` Balbir Singh
2007-09-24 11:06   ` Peter Zijlstra
2007-09-24 11:22     ` Balbir Singh
2007-09-24 11:50       ` Peter Zijlstra
2007-09-24 12:48         ` Balbir Singh
2007-09-24 12:58           ` Peter Zijlstra
2007-09-28 20:13 ` Nick Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox