* [patch] mm, memcg: periodically schedule when emptying page list
@ 2014-06-02 23:13 David Rientjes
[not found] ` <alpine.DEB.2.02.1406021612550.6487-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>
0 siblings, 1 reply; 7+ messages in thread
From: David Rientjes @ 2014-06-02 23:13 UTC (permalink / raw)
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, linux-kernel, linux-mm, cgroups
mem_cgroup_force_empty_list() can iterate a large number of pages on an lru and
mem_cgroup_move_parent() doesn't return an errno unless certain criteria, none
of which indicate that the iteration may be taking too long, is met.
We have encountered the following stack trace many times indicating
"need_resched set for > 51000020 ns (51 ticks) without schedule", for example:
scheduler_tick()
<timer irq>
mem_cgroup_move_account+0x4d/0x1d5
mem_cgroup_move_parent+0x8d/0x109
mem_cgroup_reparent_charges+0x149/0x2ba
mem_cgroup_css_offline+0xeb/0x11b
cgroup_offline_fn+0x68/0x16b
process_one_work+0x129/0x350
If this iteration is taking too long, indicated by need_resched(), then
periodically schedule and continue from where we last left off.
Signed-off-by: David Rientjes <rientjes@google.com>
---
mm/memcontrol.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4764,6 +4764,7 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
do {
struct page_cgroup *pc;
struct page *page;
+ int ret;
spin_lock_irqsave(&zone->lru_lock, flags);
if (list_empty(list)) {
@@ -4781,8 +4782,13 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
pc = lookup_page_cgroup(page);
- if (mem_cgroup_move_parent(page, pc, memcg)) {
- /* found lock contention or "pc" is obsolete. */
+ ret = mem_cgroup_move_parent(page, pc, memcg);
+ if (ret || need_resched()) {
+ /*
+ * Couldn't grab the page reference, isolate the page,
+ * there was a pc mismatch, or we simply need to
+ * schedule because this is taking too long.
+ */
busy = page;
cond_resched();
} else
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread[parent not found: <alpine.DEB.2.02.1406021612550.6487-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>]
* Re: [patch] mm, memcg: periodically schedule when emptying page list [not found] ` <alpine.DEB.2.02.1406021612550.6487-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org> @ 2014-06-02 23:44 ` Hugh Dickins 2014-06-02 23:48 ` David Rientjes 2014-06-03 0:51 ` [patch v2] " David Rientjes 1 sibling, 1 reply; 7+ messages in thread From: Hugh Dickins @ 2014-06-02 23:44 UTC (permalink / raw) To: David Rientjes Cc: Andrew Morton, Johannes Weiner, Michal Hocko, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA On Mon, 2 Jun 2014, David Rientjes wrote: > mem_cgroup_force_empty_list() can iterate a large number of pages on an lru and > mem_cgroup_move_parent() doesn't return an errno unless certain criteria, none > of which indicate that the iteration may be taking too long, is met. > > We have encountered the following stack trace many times indicating > "need_resched set for > 51000020 ns (51 ticks) without schedule", for example: > > scheduler_tick() > <timer irq> > mem_cgroup_move_account+0x4d/0x1d5 > mem_cgroup_move_parent+0x8d/0x109 > mem_cgroup_reparent_charges+0x149/0x2ba > mem_cgroup_css_offline+0xeb/0x11b > cgroup_offline_fn+0x68/0x16b > process_one_work+0x129/0x350 > > If this iteration is taking too long, indicated by need_resched(), then > periodically schedule and continue from where we last left off. > > Signed-off-by: David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> > --- > mm/memcontrol.c | 10 ++++++++-- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -4764,6 +4764,7 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg, > do { > struct page_cgroup *pc; > struct page *page; > + int ret; > > spin_lock_irqsave(&zone->lru_lock, flags); > if (list_empty(list)) { > @@ -4781,8 +4782,13 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg, > > pc = lookup_page_cgroup(page); > > - if (mem_cgroup_move_parent(page, pc, memcg)) { > - /* found lock contention or "pc" is obsolete. */ > + ret = mem_cgroup_move_parent(page, pc, memcg); > + if (ret || need_resched()) { > + /* > + * Couldn't grab the page reference, isolate the page, > + * there was a pc mismatch, or we simply need to > + * schedule because this is taking too long. > + */ > busy = page; > cond_resched(); > } else Why not just move that cond_resched() down below the if/else? No need to test need_resched() separately, and this page is not busy. Hugh ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch] mm, memcg: periodically schedule when emptying page list 2014-06-02 23:44 ` Hugh Dickins @ 2014-06-02 23:48 ` David Rientjes 2014-06-03 0:01 ` Hugh Dickins 0 siblings, 1 reply; 7+ messages in thread From: David Rientjes @ 2014-06-02 23:48 UTC (permalink / raw) To: Hugh Dickins Cc: Andrew Morton, Johannes Weiner, Michal Hocko, linux-kernel, linux-mm, cgroups On Mon, 2 Jun 2014, Hugh Dickins wrote: > Why not just move that cond_resched() down below the if/else? > No need to test need_resched() separately, and this page is not busy. > Would you like to propose your version from our kernel instead? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch] mm, memcg: periodically schedule when emptying page list 2014-06-02 23:48 ` David Rientjes @ 2014-06-03 0:01 ` Hugh Dickins 0 siblings, 0 replies; 7+ messages in thread From: Hugh Dickins @ 2014-06-03 0:01 UTC (permalink / raw) To: David Rientjes Cc: Hugh Dickins, Andrew Morton, Johannes Weiner, Michal Hocko, linux-kernel, linux-mm, cgroups On Mon, 2 Jun 2014, David Rientjes wrote: > On Mon, 2 Jun 2014, Hugh Dickins wrote: > > > Why not just move that cond_resched() down below the if/else? > > No need to test need_resched() separately, and this page is not busy. > > > > Would you like to propose your version from our kernel instead? If that's how you prefer to work it, sure. The patch would indeed look uncannily like what I put in our kernel a week ago; but I think the description might owe a lot to you! I'll get to it later, while secretly hoping you get to it sooner :) Hugh -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* [patch v2] mm, memcg: periodically schedule when emptying page list [not found] ` <alpine.DEB.2.02.1406021612550.6487-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org> 2014-06-02 23:44 ` Hugh Dickins @ 2014-06-03 0:51 ` David Rientjes 2014-06-03 1:15 ` Johannes Weiner 2014-06-03 6:27 ` Michal Hocko 1 sibling, 2 replies; 7+ messages in thread From: David Rientjes @ 2014-06-03 0:51 UTC (permalink / raw) To: Andrew Morton Cc: Hugh Dickins, Johannes Weiner, Michal Hocko, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA From: Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> mem_cgroup_force_empty_list() can iterate a large number of pages on an lru and mem_cgroup_move_parent() doesn't return an errno unless certain criteria, none of which indicate that the iteration may be taking too long, is met. We have encountered the following stack trace many times indicating "need_resched set for > 51000020 ns (51 ticks) without schedule", for example: scheduler_tick() <timer irq> mem_cgroup_move_account+0x4d/0x1d5 mem_cgroup_move_parent+0x8d/0x109 mem_cgroup_reparent_charges+0x149/0x2ba mem_cgroup_css_offline+0xeb/0x11b cgroup_offline_fn+0x68/0x16b process_one_work+0x129/0x350 If this iteration is taking too long, we still need to do cond_resched() even when an individual page is not busy. [rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org: changelog] Signed-off-by: Hugh Dickins <hughd-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> Signed-off-by: David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> --- v2: always reschedule if needed, "page" itself may not have a pc mismatch or been unable to isolate. mm/memcontrol.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -4784,9 +4784,9 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg, if (mem_cgroup_move_parent(page, pc, memcg)) { /* found lock contention or "pc" is obsolete. */ busy = page; - cond_resched(); } else busy = NULL; + cond_resched(); } while (!list_empty(list)); } ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch v2] mm, memcg: periodically schedule when emptying page list 2014-06-03 0:51 ` [patch v2] " David Rientjes @ 2014-06-03 1:15 ` Johannes Weiner 2014-06-03 6:27 ` Michal Hocko 1 sibling, 0 replies; 7+ messages in thread From: Johannes Weiner @ 2014-06-03 1:15 UTC (permalink / raw) To: David Rientjes Cc: Andrew Morton, Hugh Dickins, Michal Hocko, linux-kernel, linux-mm, cgroups On Mon, Jun 02, 2014 at 05:51:25PM -0700, David Rientjes wrote: > From: Hugh Dickins <hughd@google.com> > > mem_cgroup_force_empty_list() can iterate a large number of pages on an lru and > mem_cgroup_move_parent() doesn't return an errno unless certain criteria, none > of which indicate that the iteration may be taking too long, is met. > > We have encountered the following stack trace many times indicating > "need_resched set for > 51000020 ns (51 ticks) without schedule", for example: > > scheduler_tick() > <timer irq> > mem_cgroup_move_account+0x4d/0x1d5 > mem_cgroup_move_parent+0x8d/0x109 > mem_cgroup_reparent_charges+0x149/0x2ba > mem_cgroup_css_offline+0xeb/0x11b > cgroup_offline_fn+0x68/0x16b > process_one_work+0x129/0x350 > > If this iteration is taking too long, we still need to do cond_resched() even > when an individual page is not busy. > > [rientjes@google.com: changelog] > Signed-off-by: Hugh Dickins <hughd@google.com> > Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch v2] mm, memcg: periodically schedule when emptying page list 2014-06-03 0:51 ` [patch v2] " David Rientjes 2014-06-03 1:15 ` Johannes Weiner @ 2014-06-03 6:27 ` Michal Hocko 1 sibling, 0 replies; 7+ messages in thread From: Michal Hocko @ 2014-06-03 6:27 UTC (permalink / raw) To: David Rientjes Cc: Andrew Morton, Hugh Dickins, Johannes Weiner, linux-kernel, linux-mm, cgroups On Mon 02-06-14 17:51:25, David Rientjes wrote: > From: Hugh Dickins <hughd@google.com> > > mem_cgroup_force_empty_list() can iterate a large number of pages on an lru and > mem_cgroup_move_parent() doesn't return an errno unless certain criteria, none > of which indicate that the iteration may be taking too long, is met. > > We have encountered the following stack trace many times indicating > "need_resched set for > 51000020 ns (51 ticks) without schedule", for example: > > scheduler_tick() > <timer irq> > mem_cgroup_move_account+0x4d/0x1d5 > mem_cgroup_move_parent+0x8d/0x109 > mem_cgroup_reparent_charges+0x149/0x2ba > mem_cgroup_css_offline+0xeb/0x11b > cgroup_offline_fn+0x68/0x16b > process_one_work+0x129/0x350 > > If this iteration is taking too long, we still need to do cond_resched() even > when an individual page is not busy. > > [rientjes@google.com: changelog] > Signed-off-by: Hugh Dickins <hughd@google.com> > Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.cz> Thanks! > --- > v2: always reschedule if needed, "page" itself may not have a pc mismatch > or been unable to isolate. > > mm/memcontrol.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -4784,9 +4784,9 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg, > if (mem_cgroup_move_parent(page, pc, memcg)) { > /* found lock contention or "pc" is obsolete. */ > busy = page; > - cond_resched(); > } else > busy = NULL; > + cond_resched(); > } while (!list_empty(list)); > } > -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-06-03 6:27 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-06-02 23:13 [patch] mm, memcg: periodically schedule when emptying page list David Rientjes
[not found] ` <alpine.DEB.2.02.1406021612550.6487-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>
2014-06-02 23:44 ` Hugh Dickins
2014-06-02 23:48 ` David Rientjes
2014-06-03 0:01 ` Hugh Dickins
2014-06-03 0:51 ` [patch v2] " David Rientjes
2014-06-03 1:15 ` Johannes Weiner
2014-06-03 6:27 ` Michal Hocko
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox