From mboxrd@z Thu Jan 1 00:00:00 1970 From: Hugh Dickins Subject: Re: [patch] mm, memcg: periodically schedule when emptying page list Date: Mon, 2 Jun 2014 16:44:34 -0700 (PDT) Message-ID: References: Mime-Version: 1.0 Return-path: DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=date:from:to:cc:subject:in-reply-to:message-id:references :user-agent:mime-version:content-type; bh=mBpnYmNNHVV75pJqP9CHoQyEAKqacqRkiJtI/EjI+z4=; b=M8/HRvHf3gkBJaUH5OHNxbMW3vJMIlZTnBVtzhnQIj6ZeEbPq20Csw/vwuQ3HuV4E/ QSEu5TGZcF1a5KNSsPnYbR9Twh+NwEXKGRYJ4xsoGsT8uQF352pW8z3pYCeI3AeikCl9 SorR/nSmUUkDMHkH1tvrF7ZBgvmOkGLA9B95ufGmdbsDEFvPgMQ9uKsHPYFpfAbJwp8N 2Sf/Egb6pjogVRMQh5zTI9e/gDLg7wKQGAmHzGcmYxfnrHHY74dj4FrJ63NvOcSINnjf sRxwMe4EtYPK8Fatia+m0t/bIxa7GGTfhlEpvhRwA5qXz9912VcMrPczi9sXn/FRZbGk 2YRQ== In-Reply-To: Sender: cgroups-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org List-ID: Content-Type: TEXT/PLAIN; charset="us-ascii" Content-Transfer-Encoding: 7bit To: David Rientjes Cc: Andrew Morton , Johannes Weiner , Michal Hocko , linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On Mon, 2 Jun 2014, David Rientjes wrote: > mem_cgroup_force_empty_list() can iterate a large number of pages on an lru and > mem_cgroup_move_parent() doesn't return an errno unless certain criteria, none > of which indicate that the iteration may be taking too long, is met. > > We have encountered the following stack trace many times indicating > "need_resched set for > 51000020 ns (51 ticks) without schedule", for example: > > scheduler_tick() > > mem_cgroup_move_account+0x4d/0x1d5 > mem_cgroup_move_parent+0x8d/0x109 > mem_cgroup_reparent_charges+0x149/0x2ba > mem_cgroup_css_offline+0xeb/0x11b > cgroup_offline_fn+0x68/0x16b > process_one_work+0x129/0x350 > > If this iteration is taking too long, indicated by need_resched(), then > periodically schedule and continue from where we last left off. > > Signed-off-by: David Rientjes > --- > mm/memcontrol.c | 10 ++++++++-- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -4764,6 +4764,7 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg, > do { > struct page_cgroup *pc; > struct page *page; > + int ret; > > spin_lock_irqsave(&zone->lru_lock, flags); > if (list_empty(list)) { > @@ -4781,8 +4782,13 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg, > > pc = lookup_page_cgroup(page); > > - if (mem_cgroup_move_parent(page, pc, memcg)) { > - /* found lock contention or "pc" is obsolete. */ > + ret = mem_cgroup_move_parent(page, pc, memcg); > + if (ret || need_resched()) { > + /* > + * Couldn't grab the page reference, isolate the page, > + * there was a pc mismatch, or we simply need to > + * schedule because this is taking too long. > + */ > busy = page; > cond_resched(); > } else Why not just move that cond_resched() down below the if/else? No need to test need_resched() separately, and this page is not busy. Hugh