* [RFC][PATCH] mm: couple rcu and memory reclaim
@ 2007-09-24 8:45 Peter Zijlstra
2007-09-24 10:42 ` Balbir Singh
2007-09-28 20:13 ` Nick Piggin
0 siblings, 2 replies; 8+ messages in thread
From: Peter Zijlstra @ 2007-09-24 8:45 UTC (permalink / raw)
To: Paul E. McKenney, linux-kernel; +Cc: Nick Piggin, Rik van Riel
Just an idea I had, it seems like a good idea to wait for RCU callbacks
in reclaim so that we won't get all of memory stuck there.
If this location is too aggressive we might stick it next to
disable_swap_token().
---
Couple RCU and reclaim.
There could be a lot of memory stuck in RCU callbacks. Wait for RCU to
finish before giving it another go.
Placed in kswapd and not direct reclaim path because kswapd never holds
rcu_read_lock() at this point and can thus not deadlock. Direct reclaim
callers might hold rcu_read_lock() and would suffer from deadlocks if
sync_rcu() were to be called.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
mm/vmscan.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -1435,8 +1435,10 @@ loop_again:
unsigned long lru_pages = 0;
/* The swap token gets in the way of swapout... */
- if (!priority)
+ if (!priority) {
+ synchronize_rcu();
disable_swap_token();
+ }
all_zones_ok = 1;
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [RFC][PATCH] mm: couple rcu and memory reclaim 2007-09-24 8:45 [RFC][PATCH] mm: couple rcu and memory reclaim Peter Zijlstra @ 2007-09-24 10:42 ` Balbir Singh 2007-09-24 11:06 ` Peter Zijlstra 2007-09-28 20:13 ` Nick Piggin 1 sibling, 1 reply; 8+ messages in thread From: Balbir Singh @ 2007-09-24 10:42 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Paul E. McKenney, linux-kernel, Nick Piggin, Rik van Riel Peter Zijlstra wrote: > > Just an idea I had, it seems like a good idea to wait for RCU callbacks > in reclaim so that we won't get all of memory stuck there. > > If this location is too aggressive we might stick it next to > disable_swap_token(). > > --- > Couple RCU and reclaim. > > There could be a lot of memory stuck in RCU callbacks. Wait for RCU to > finish before giving it another go. > > Placed in kswapd and not direct reclaim path because kswapd never holds > rcu_read_lock() at this point and can thus not deadlock. Direct reclaim > callers might hold rcu_read_lock() and would suffer from deadlocks if > sync_rcu() were to be called. > > Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> > --- > mm/vmscan.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > Index: linux-2.6/mm/vmscan.c > =================================================================== > --- linux-2.6.orig/mm/vmscan.c > +++ linux-2.6/mm/vmscan.c > @@ -1435,8 +1435,10 @@ loop_again: > unsigned long lru_pages = 0; > > /* The swap token gets in the way of swapout... */ > - if (!priority) > + if (!priority) { > + synchronize_rcu(); Interesting change 1. I suspect that synchronize_rcu() is most likely to free up slab pages, so shrink_slab() will clean up all the freed pages. Could we add a comment to indicate the same? 2. Shouldn't we do this in balance_pgdat() as well? > disable_swap_token(); > + } > > all_zones_ok = 1; > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC][PATCH] mm: couple rcu and memory reclaim 2007-09-24 10:42 ` Balbir Singh @ 2007-09-24 11:06 ` Peter Zijlstra 2007-09-24 11:22 ` Balbir Singh 0 siblings, 1 reply; 8+ messages in thread From: Peter Zijlstra @ 2007-09-24 11:06 UTC (permalink / raw) To: balbir; +Cc: Paul E. McKenney, linux-kernel, Nick Piggin, Rik van Riel On Mon, 24 Sep 2007 16:12:19 +0530 Balbir Singh <balbir@linux.vnet.ibm.com> wrote: > Peter Zijlstra wrote: > > > > Just an idea I had, it seems like a good idea to wait for RCU callbacks > > in reclaim so that we won't get all of memory stuck there. > > > > If this location is too aggressive we might stick it next to > > disable_swap_token(). > > > > --- > > Couple RCU and reclaim. > > > > There could be a lot of memory stuck in RCU callbacks. Wait for RCU to > > finish before giving it another go. > > > > Placed in kswapd and not direct reclaim path because kswapd never holds > > rcu_read_lock() at this point and can thus not deadlock. Direct reclaim > > callers might hold rcu_read_lock() and would suffer from deadlocks if > > sync_rcu() were to be called. > > > > Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> > > --- > > mm/vmscan.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > Index: linux-2.6/mm/vmscan.c > > =================================================================== > > --- linux-2.6.orig/mm/vmscan.c > > +++ linux-2.6/mm/vmscan.c > > @@ -1435,8 +1435,10 @@ loop_again: > > unsigned long lru_pages = 0; > > > > /* The swap token gets in the way of swapout... */ > > - if (!priority) > > + if (!priority) { > > + synchronize_rcu(); Bah, it seems I send the wrong patch out :-/ this is the one against disable_swap_token(). I meant to send out this one: @@ -1527,8 +1527,10 @@ loop_again: * OK, kswapd is getting into trouble. Take a nap, then take * another pass across the zones. */ - if (total_scanned && priority < DEF_PRIORITY - 2) + if (total_scanned && priority < DEF_PRIORITY - 2) { + synchronize_rcu(); congestion_wait(WRITE, HZ/10); + } > Interesting change > > 1. I suspect that synchronize_rcu() is most likely to free up > slab pages, so shrink_slab() will clean up all the freed > pages. Could we add a comment to indicate the same? Yes indeed, will add such a comment. > 2. Shouldn't we do this in balance_pgdat() as well? Uhm, this is balance_pgdat() (both these changes) :-) Only kswapd can do this, direct reclaim has deadlock potential. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC][PATCH] mm: couple rcu and memory reclaim 2007-09-24 11:06 ` Peter Zijlstra @ 2007-09-24 11:22 ` Balbir Singh 2007-09-24 11:50 ` Peter Zijlstra 0 siblings, 1 reply; 8+ messages in thread From: Balbir Singh @ 2007-09-24 11:22 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Paul E. McKenney, linux-kernel, Nick Piggin, Rik van Riel Peter Zijlstra wrote: > On Mon, 24 Sep 2007 16:12:19 +0530 Balbir Singh > <balbir@linux.vnet.ibm.com> wrote: > >> Peter Zijlstra wrote: >>> Just an idea I had, it seems like a good idea to wait for RCU callbacks >>> in reclaim so that we won't get all of memory stuck there. >>> >>> If this location is too aggressive we might stick it next to >>> disable_swap_token(). >>> >>> --- >>> Couple RCU and reclaim. >>> >>> There could be a lot of memory stuck in RCU callbacks. Wait for RCU to >>> finish before giving it another go. >>> >>> Placed in kswapd and not direct reclaim path because kswapd never holds >>> rcu_read_lock() at this point and can thus not deadlock. Direct reclaim >>> callers might hold rcu_read_lock() and would suffer from deadlocks if >>> sync_rcu() were to be called. >>> >>> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> >>> --- >>> mm/vmscan.c | 4 +++- >>> 1 file changed, 3 insertions(+), 1 deletion(-) >>> >>> Index: linux-2.6/mm/vmscan.c >>> =================================================================== >>> --- linux-2.6.orig/mm/vmscan.c >>> +++ linux-2.6/mm/vmscan.c >>> @@ -1435,8 +1435,10 @@ loop_again: >>> unsigned long lru_pages = 0; >>> >>> /* The swap token gets in the way of swapout... */ >>> - if (!priority) >>> + if (!priority) { >>> + synchronize_rcu(); > > Bah, it seems I send the wrong patch out :-/ > Looks like I reviewed the wrong thing then :-) > this is the one against disable_swap_token(). I meant to send out this > one: > > @@ -1527,8 +1527,10 @@ loop_again: > * OK, kswapd is getting into trouble. Take a nap, then take > * another pass across the zones. > */ > - if (total_scanned && priority < DEF_PRIORITY - 2) > + if (total_scanned && priority < DEF_PRIORITY - 2) { > + synchronize_rcu(); > congestion_wait(WRITE, HZ/10); > + } > > >> Interesting change >> >> 1. I suspect that synchronize_rcu() is most likely to free up >> slab pages, so shrink_slab() will clean up all the freed >> pages. Could we add a comment to indicate the same? > > Yes indeed, will add such a comment. > >> 2. Shouldn't we do this in balance_pgdat() as well? > > Uhm, this is balance_pgdat() (both these changes) :-) > Hmm.. Could you please generate the diff with -p. > Only kswapd can do this, direct reclaim has deadlock potential. Yes, but not in all cases, do you want to add any gfp_mask based smartness for direct reclaim? -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC][PATCH] mm: couple rcu and memory reclaim 2007-09-24 11:22 ` Balbir Singh @ 2007-09-24 11:50 ` Peter Zijlstra 2007-09-24 12:48 ` Balbir Singh 0 siblings, 1 reply; 8+ messages in thread From: Peter Zijlstra @ 2007-09-24 11:50 UTC (permalink / raw) To: balbir; +Cc: Paul E. McKenney, linux-kernel, Nick Piggin, Rik van Riel On Mon, 24 Sep 2007 16:52:15 +0530 Balbir Singh <balbir@linux.vnet.ibm.com> wrote: > Peter Zijlstra wrote: > > On Mon, 24 Sep 2007 16:12:19 +0530 Balbir Singh > > <balbir@linux.vnet.ibm.com> wrote: > > > >> Peter Zijlstra wrote: > >>> Just an idea I had, it seems like a good idea to wait for RCU callbacks > >>> in reclaim so that we won't get all of memory stuck there. > >>> > >>> If this location is too aggressive we might stick it next to > >>> disable_swap_token(). > >>> > >>> --- > >>> Couple RCU and reclaim. > >>> > >>> There could be a lot of memory stuck in RCU callbacks. Wait for RCU to > >>> finish before giving it another go. > >>> > >>> Placed in kswapd and not direct reclaim path because kswapd never holds > >>> rcu_read_lock() at this point and can thus not deadlock. Direct reclaim > >>> callers might hold rcu_read_lock() and would suffer from deadlocks if > >>> sync_rcu() were to be called. > >>> > >>> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> > >>> --- --- mm/vmscan.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) Index: linux-2.6/mm/vmscan.c =================================================================== --- linux-2.6.orig/mm/vmscan.c +++ linux-2.6/mm/vmscan.c @@ -1527,8 +1527,10 @@ loop_again: * OK, kswapd is getting into trouble. Take a nap, then take * another pass across the zones. */ - if (total_scanned && priority < DEF_PRIORITY - 2) + if (total_scanned && priority < DEF_PRIORITY - 2) { + synchronize_rcu(); congestion_wait(WRITE, HZ/10); + } /* * We do this so kswapd doesn't build up large priorities for > > Only kswapd can do this, direct reclaim has deadlock potential. > > Yes, but not in all cases, do you want to add any gfp_mask > based smartness for direct reclaim? gfp_mask doesn't carry the needed information. It depends on whether the current context holds a rcu_read_lock(). so something like: rcu_read_lock() foo = kmalloc(sizeof(foo)) new_slab() __alloc_pages() try_to_free_pages() synchronise_rcu() <-- deadlock rcu_read_unlock() ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC][PATCH] mm: couple rcu and memory reclaim 2007-09-24 11:50 ` Peter Zijlstra @ 2007-09-24 12:48 ` Balbir Singh 2007-09-24 12:58 ` Peter Zijlstra 0 siblings, 1 reply; 8+ messages in thread From: Balbir Singh @ 2007-09-24 12:48 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Paul E. McKenney, linux-kernel, Nick Piggin, Rik van Riel Peter Zijlstra wrote: >>> Only kswapd can do this, direct reclaim has deadlock potential. >> Yes, but not in all cases, do you want to add any gfp_mask >> based smartness for direct reclaim? > > gfp_mask doesn't carry the needed information. It depends on whether > the current context holds a rcu_read_lock(). > What I meant was that nobody would hold rcu_read_lock() and pass gfp_mask of GFP_KERNEL in scan_control or to do_try_to_free_pages() > so something like: > > rcu_read_lock() > foo = kmalloc(sizeof(foo)) At this point, you really can't use GFP_KERNEL, since rcu_read_lock() disables pre-emption in the current kernel, ideally you should see a might_sleep() BUG. > new_slab() > __alloc_pages() > try_to_free_pages() > synchronise_rcu() <-- deadlock > rcu_read_unlock() -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC][PATCH] mm: couple rcu and memory reclaim 2007-09-24 12:48 ` Balbir Singh @ 2007-09-24 12:58 ` Peter Zijlstra 0 siblings, 0 replies; 8+ messages in thread From: Peter Zijlstra @ 2007-09-24 12:58 UTC (permalink / raw) To: balbir; +Cc: Paul E. McKenney, linux-kernel, Nick Piggin, Rik van Riel On Mon, 24 Sep 2007 18:18:30 +0530 Balbir Singh <balbir@linux.vnet.ibm.com> wrote: > Peter Zijlstra wrote: > >>> Only kswapd can do this, direct reclaim has deadlock potential. > >> Yes, but not in all cases, do you want to add any gfp_mask > >> based smartness for direct reclaim? > > > > gfp_mask doesn't carry the needed information. It depends on whether > > the current context holds a rcu_read_lock(). > > > > What I meant was that nobody would hold rcu_read_lock() and pass > gfp_mask of GFP_KERNEL in scan_control or to do_try_to_free_pages() > > > so something like: > > > > rcu_read_lock() > > foo = kmalloc(sizeof(foo)) > > At this point, you really can't use GFP_KERNEL, since rcu_read_lock() > disables pre-emption in the current kernel, ideally you should see > a might_sleep() BUG. > > > new_slab() > > __alloc_pages() > > try_to_free_pages() > > synchronise_rcu() <-- deadlock > > rcu_read_unlock() I guess I've been using preemptibe rcu for way too long. Would need to clean up some of -rt to make this true there, but should be doable indeed. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC][PATCH] mm: couple rcu and memory reclaim 2007-09-24 8:45 [RFC][PATCH] mm: couple rcu and memory reclaim Peter Zijlstra 2007-09-24 10:42 ` Balbir Singh @ 2007-09-28 20:13 ` Nick Piggin 1 sibling, 0 replies; 8+ messages in thread From: Nick Piggin @ 2007-09-28 20:13 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Paul E. McKenney, linux-kernel, Rik van Riel On Monday 24 September 2007 18:45, Peter Zijlstra wrote: > Just an idea I had, it seems like a good idea to wait for RCU callbacks > in reclaim so that we won't get all of memory stuck there. I think it would be much too aggressive (_especially_ with preemptible RCU, I would have thought?). And not just for the case of adding a lot of idle time while other CPUs are busy not being preempted -- also that if you have lots of CPUs reclaiming and asking to synchronize_rcu, then you could have global RCU state getting ping-ponged. > If this location is too aggressive we might stick it next to > disable_swap_token(). I don't see a really good reason to add this kind of wait here. I mean, we have to wait for RCU anyway, so we may as well scan what we can until then? Putting it in a slowpath at some point when we think we're heading for OOM I think would be a bit better. ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2007-09-29 12:45 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-09-24 8:45 [RFC][PATCH] mm: couple rcu and memory reclaim Peter Zijlstra 2007-09-24 10:42 ` Balbir Singh 2007-09-24 11:06 ` Peter Zijlstra 2007-09-24 11:22 ` Balbir Singh 2007-09-24 11:50 ` Peter Zijlstra 2007-09-24 12:48 ` Balbir Singh 2007-09-24 12:58 ` Peter Zijlstra 2007-09-28 20:13 ` Nick Piggin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox