From: Pavel Machek <pavel@ucw.cz>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: "P. Christeas" <xrg@linux.gr>,
linux-mm@kvack.org, Joonsoo Kim <iamjoonsoo.kim@lge.com>,
lkml <linux-kernel@vger.kernel.org>,
David Rientjes <rientjes@google.com>,
Norbert Preining <preining@logic.at>,
Markus Trippelsdorf <markus@trippelsdorf.de>
Subject: Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c
Date: Sun, 9 Nov 2014 09:27:46 +0100
Message-ID: <20141109082746.GA3402@amd>
In-Reply-To: <545E96BD.5040103@suse.cz>
[-- Attachment #1: Type: text/plain, Size: 2258 bytes --]
Hi!
> >> Oh and did I ask in this thread for /proc/zoneinfo yet? :)
> >
> > Using that same kernel[1], got again into a race, gathered a few more data.
> >
> > This time, I had 1x "urpmq" process [2] hung at 100% CPU , when "kwin" got
> > apparently blocked (100% CPU, too) trying to resize a GUI window. I suppose
> > the resizing operation would mean heavy memory alloc/free.
> >
> > The rest of the system was responsive, I could easily get a console, login,
> > gather the files.. Then, I have *killed* -9 the "urpmq" process, which solved
> > the race and my system is still alive! "kwin" is still running, returned to
> > regular CPU load.
> >
> > Attached is traces from SysRq+l (pressed a few times, wanted to "snapshot" the
> > stack) and /proc/zoneinfo + /proc/vmstat
> >
> > Bisection is not yet meaningful, IMHO, because I cannot be sure that "good"
> > points are really free from this issue. I'd estimate that each test would take
> > +3days, unless I really find a deterministic way to reproduce the issue .
>
> Hi,
>
> I think I finally found the cause by staring into the code... CCing
> people from all 4 separate threads I know about this issue.
> The problem with finding the cause was that the first report I got from
> Markus was about isolate_freepages_block() overhead, and later Norbert
> reported that reverting a patch for isolate_freepages* helped. But the
> problem seems to be that although the loop in isolate_migratepages exits
> because the scanners almost meet (they are within same pageblock), they
> don't truly meet, therefore compact_finished() decides to continue, but
> isolate_migratepages() exits immediately... boom! But indeed e14c720efdd7
> made this situation possible, as the free scanner pfn can now point to the
> middle of a pageblock.
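
The stall described in the quoted analysis can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code in mm/compaction.c: the toy_* names, the 512-page pageblock size, and the iteration cap are all made up for illustration. The migration scanner only walks whole pageblocks that end at or below free_pfn, while the finish check requires the two scanners to truly meet.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed pageblock size for this toy model only. */
#define TOY_PAGEBLOCK_NR_PAGES 512UL

/* Hypothetical, stripped-down stand-in for the compaction state. */
struct toy_compact_control {
	unsigned long migrate_pfn;	/* migration scanner, moves upward */
	unsigned long free_pfn;		/* free scanner, may sit mid-pageblock */
};

/* Toy finish check: done only when the scanners truly meet. */
static bool toy_compact_finished(const struct toy_compact_control *cc)
{
	return cc->free_pfn <= cc->migrate_pfn;
}

/*
 * Toy migration scan: walks only whole pageblocks whose end does not
 * pass free_pfn. Returns the number of pageblocks it walked.
 */
static unsigned long toy_isolate_migratepages(struct toy_compact_control *cc)
{
	unsigned long scanned = 0;
	unsigned long low_pfn = cc->migrate_pfn;
	unsigned long end_pfn =
		(low_pfn / TOY_PAGEBLOCK_NR_PAGES + 1) * TOY_PAGEBLOCK_NR_PAGES;

	for (; end_pfn <= cc->free_pfn;
	     low_pfn = end_pfn, end_pfn += TOY_PAGEBLOCK_NR_PAGES)
		scanned++;

	cc->migrate_pfn = low_pfn;
	return scanned;
}

/*
 * Toy outer loop with an artificial cap so the demonstration terminates.
 * Returns how many iterations made no progress at all.
 */
static int toy_compact_zone(struct toy_compact_control *cc, int max_iters)
{
	int stalled = 0;

	assert(max_iters >= 0);
	for (int i = 0; i < max_iters && !toy_compact_finished(cc); i++) {
		if (toy_isolate_migratepages(cc) == 0)
			stalled++;
	}
	return stalled;
}
```

With free_pfn in the middle of a pageblock (say 700), the migration scanner in this model stops at pfn 512, the finish check never succeeds, and every further iteration does nothing; with a pageblock-aligned free_pfn (say 1024) the scanners meet and the loop ends.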
OK, it seems it has happened a second time now, again shortly after
resume. I guess I should apply your patch after all.
(Or... instead it should go to Linus ASAP -- it fixes a known problem
that is affecting people, and we want it in soon in case it is not
a complete fix.)
Dmesg is in the attachment, perhaps it helps.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
[-- Attachment #2: delme.gz --]
[-- Type: application/gzip, Size: 18436 bytes --]