From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Greg Thelen <gthelen@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>,
Mel Gorman <mel@csn.ul.ie>,
linux-mm@kvack.org,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [PATCH] vmscan: Fix do_try_to_free_pages() return value when priority==0 reclaim failure
Date: Tue, 1 Jun 2010 13:40:59 +0530
Message-ID: <20100601081059.GA2804@balbir.in.ibm.com>
In-Reply-To: <20100601122140.2436.A69D9226@jp.fujitsu.com>
* KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> [2010-06-01 12:29:41]:
> CC to memcg folks.
>
> > I agree with the direction of this patch, but I am seeing a hang when
> > testing with mmotm-2010-05-21-16-05. The following test hangs, unless I
> > remove this patch from mmotm:
> > mount -t cgroup none /cgroups -o memory
> > mkdir /cgroups/cg1
> > echo $$ > /cgroups/cg1/tasks
> > dd bs=1024 count=1024 if=/dev/null of=/data/foo
> > echo $$ > /cgroups/tasks
> > echo 1 > /cgroups/cg1/memory.force_empty
> >
> > I think the hang is caused by the following portion of
> > mem_cgroup_force_empty():
> >     while (nr_retries && mem->res.usage > 0) {
> >         int progress;
> >
> >         if (signal_pending(current)) {
> >             ret = -EINTR;
> >             goto out;
> >         }
> >         progress = try_to_free_mem_cgroup_pages(mem, GFP_KERNEL,
> >                         false, get_swappiness(mem));
> >         if (!progress) {
> >             nr_retries--;
> >             /* maybe some writeback is necessary */
> >             congestion_wait(BLK_RW_ASYNC, HZ/10);
> >         }
> >
> >     }
> >
> > With this patch applied, it is possible that when do_try_to_free_pages()
> > calls shrink_zones() for priority 0 that shrink_zones() may return 1
> > indicating progress, even though no pages may have been reclaimed.
> > Because this is a cgroup operation, scanning_global_lru() is false and
> > the following portion of do_try_to_free_pages() fails to set ret=0.
> > > if (ret && scanning_global_lru(sc))
> > > ret = sc->nr_reclaimed;
> > This leaves ret=1 indicating that do_try_to_free_pages() reclaimed 1
> > page even though it did not reclaim any pages. Therefore
> > mem_cgroup_force_empty() erroneously believes that
> > try_to_free_mem_cgroup_pages() is making progress (one page at a time),
> > so there is an endless loop.
>
> Good catch!
>
> Yeah, your analysis is fine. Thank you for both testing and
> analyzing this.
>
> Unfortunately, this logic needs a further fix, because it has already
> been broken by another regression. My point is that if priority==0
> reclaim fails, "ret = sc->nr_reclaimed" makes no sense at all.
>
> The fixing patch is here. What do you think?
>
>
>
> From 49a395b21fe1b2f864112e71d027ffcafbdc9fc0 Mon Sep 17 00:00:00 2001
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Date: Tue, 1 Jun 2010 11:29:50 +0900
> Subject: [PATCH] vmscan: Fix do_try_to_free_pages() return value when priority==0 reclaim failure
>
> Greg Thelen reported that Johannes's recent stack diet patch makes the
> kernel hang. His test is as follows.
>
> mount -t cgroup none /cgroups -o memory
> mkdir /cgroups/cg1
> echo $$ > /cgroups/cg1/tasks
> dd bs=1024 count=1024 if=/dev/null of=/data/foo
> echo $$ > /cgroups/tasks
> echo 1 > /cgroups/cg1/memory.force_empty
>
> Actually, this try-hard-before-OOM logic has been broken
> since the following two-year-old patch.
>
> commit a41f24ea9fd6169b147c53c2392e2887cc1d9247
> Author: Nishanth Aravamudan <nacc@us.ibm.com>
> Date: Tue Apr 29 00:58:25 2008 -0700
>
> page allocator: smarter retry of costly-order allocations
>
> The original intention was "return success if the system has shrinkable
> zones even though priority==0 reclaim failed". But the above patch
> changed this to "return nr_reclaimed if .....". That overlooked that
> nr_reclaimed may be 0 when priority==0 reclaim fails.
>
> Johannes's patch broke it further. Originally, a priority==0 reclaim
> failure on memcg returned 0, but his patch changed it to return 1,
> which totally confused memcg.
>
> This patch fixes it completely.
>
The patch seems reasonable to me, although I have not tested it.
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
--
Three Cheers,
Balbir