From: Johannes Stezenbach <js@sig21.net>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, mhocko@kernel.org
Subject: Re: 4.6.2 frequent crashes under memory + IO pressure
Date: Sat, 25 Jun 2016 17:50:06 +0200 [thread overview]
Message-ID: <20160625155006.GA4166@sig21.net> (raw)
In-Reply-To: <201606232026.GFJ26539.QVtFFOJOOLHFMS@I-love.SAKURA.ne.jp>
On Thu, Jun 23, 2016 at 08:26:35PM +0900, Tetsuo Handa wrote:
>
> Since you think you saw OOM messages with the older kernels, I assume that the OOM
> killer was invoked on your 4.6.2 kernel. The OOM reaper in Linux 4.6 and Linux 4.7
> will not help if the OOM killed process was between down_write(&mm->mmap_sem) and
> up_write(&mm->mmap_sem).
>
> I was not able to confirm whether the OOM killed process (I guess it was java)
> was holding mm->mmap_sem for write, for /proc/sys/kernel/hung_task_warnings
> dropped to 0 before traces of java threads are printed or console became
> unusable due to the "delayed: kcryptd_crypt, ..." line. Anyway, I think that
> kmallocwd will report it.
>
> > > It is sad that we haven't merged kmallocwd which will report
> > > which memory allocations are stalling
> > > ( http://lkml.kernel.org/r/1462630604-23410-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp ).
> >
> > Would you like me to try it? It wouldn't prevent the hang, though,
> > just print better debug ouptut to serial console, right?
> > Or would it OOM kill some process?
>
> Yes, but for bisection purpose, please try commit 78ebc2f7146156f4 without
> applying kmallocwd. If that commit helps avoiding flood of the allocation
> failure warnings, we can consider backporting it. If that commit does not
> help, I think you are reporting a new location which we should not use
> memory reserves.
>
> kmallocwd will not OOM kill some process. kmallocwd will not prevent the hang.
> kmallocwd just prints information of threads which are stalling inside memory
> allocation request.
First I tried today's git, linux-4.7-rc4-187-g086e3eb, and
the good news is that the oom killer seems to work very
well and reliably killed the offending task (java).
It happened a few times, the AOSP build broke and I restarted
it until it completed. E.g.:
[ 2083.604374] Purging GPU memory, 0 pages freed, 4508 pages still pinned.
[ 2083.611000] 96 and 0 pages still available in the bound and unbound GPU page lists.
[ 2083.618815] make invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=2, oom_score_adj=0
[ 2083.629257] make cpuset=/ mems_allowed=0
...
[ 2084.688753] Out of memory: Kill process 10431 (java) score 378 or sacrifice child
[ 2084.696593] Killed process 10431 (java) total-vm:5200964kB, anon-rss:2521764kB, file-rss:0kB, shmem-rss:0kB
[ 2084.938058] oom_reaper: reaped process 10431 (java), now anon-rss:0kB, file-rss:8kB, shmem-rss:0kB
Next I tried 4.6.2 with 78ebc2f7146156f4, then with kmallocwd (needed one manual fixup),
then both patches. It still livelocked in all cases, the log spew looked
a bit different with 78ebc2f7146156f4 applied but still continued
endlessly. kmallocwd alone didn't trigger, with both patches
applied kmallocwd triggered but:
[ 363.815595] MemAlloc-Info: stalling=33 dying=0 exiting=42 victim=0 oom_count=0
[ 363.815601] MemAlloc: kworker/0:0(4) flags=0x4208860 switches=212 seq=1 gfp=0x26012c0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_NOTRACK) order=0 delay=17984
** 1402 printk messages dropped ** [ 363.818816] [<ffffffff8116d519>] __do_page_cache_readahead+0x144/0x29d
** 501 printk messages dropped **
I'll zip up the logs and send them off-list.
Thanks,
Johannes
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2016-06-25 15:50 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-16 21:26 4.6.2 frequent crashes under memory + IO pressure Johannes Stezenbach
2016-06-21 11:47 ` Tetsuo Handa
2016-06-23 9:18 ` Johannes Stezenbach
2016-06-23 11:26 ` Tetsuo Handa
2016-06-25 15:50 ` Johannes Stezenbach [this message]
2016-06-25 17:04 ` Tetsuo Handa
2016-06-25 17:29 ` Johannes Stezenbach
2016-06-26 9:00 ` Tetsuo Handa
[not found] ` <20160626150958.GA3780@sig21.net>
[not found] ` <201606270135.CGD13081.LHFtFVQOSOMOJF@I-love.SAKURA.ne.jp>
2016-06-26 19:40 ` Johannes Stezenbach
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160625155006.GA4166@sig21.net \
--to=js@sig21.net \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=penguin-kernel@I-love.SAKURA.ne.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).