From: Johannes Stezenbach <js@sig21.net>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, mhocko@kernel.org
Subject: Re: 4.6.2 frequent crashes under memory + IO pressure
Date: Sat, 25 Jun 2016 17:50:06 +0200
Message-ID: <20160625155006.GA4166@sig21.net>
In-Reply-To: <201606232026.GFJ26539.QVtFFOJOOLHFMS@I-love.SAKURA.ne.jp>

On Thu, Jun 23, 2016 at 08:26:35PM +0900, Tetsuo Handa wrote:
> 
> Since you think you saw OOM messages with the older kernels, I assume that the OOM
> killer was invoked on your 4.6.2 kernel. The OOM reaper in Linux 4.6 and Linux 4.7
> will not help if the OOM killed process was between down_write(&mm->mmap_sem) and
> up_write(&mm->mmap_sem).
> 
> I was not able to confirm whether the OOM killed process (I guess it was java)
> was holding mm->mmap_sem for write, for /proc/sys/kernel/hung_task_warnings
> dropped to 0 before traces of java threads are printed or console became
> unusable due to the "delayed: kcryptd_crypt, ..." line. Anyway, I think that
> kmallocwd will report it.
> 
> > > It is sad that we haven't merged kmallocwd which will report
> > > which memory allocations are stalling
> > >  ( http://lkml.kernel.org/r/1462630604-23410-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp ).
> > 
> > Would you like me to try it?  It wouldn't prevent the hang, though,
> > just print better debug output to serial console, right?
> > Or would it OOM kill some process?
> 
> Yes, but for bisection purpose, please try commit 78ebc2f7146156f4 without
> applying kmallocwd. If that commit helps avoiding flood of the allocation
> failure warnings, we can consider backporting it. If that commit does not
> help, I think you are reporting a new location which we should not use
> memory reserves.
> 
> kmallocwd will not OOM kill some process. kmallocwd will not prevent the hang.
> kmallocwd just prints information of threads which are stalling inside memory
> allocation request.

First I tried today's git, linux-4.7-rc4-187-g086e3eb, and
the good news is that the oom killer seems to work very
well and reliably killed the offending task (java).
This happened a few times: each time the AOSP build broke, and I
restarted it until it completed.  E.g.:

[ 2083.604374] Purging GPU memory, 0 pages freed, 4508 pages still pinned.
[ 2083.611000] 96 and 0 pages still available in the bound and unbound GPU page lists.
[ 2083.618815] make invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=2, oom_score_adj=0
[ 2083.629257] make cpuset=/ mems_allowed=0
...
[ 2084.688753] Out of memory: Kill process 10431 (java) score 378 or sacrifice child
[ 2084.696593] Killed process 10431 (java) total-vm:5200964kB, anon-rss:2521764kB, file-rss:0kB, shmem-rss:0kB
[ 2084.938058] oom_reaper: reaped process 10431 (java), now anon-rss:0kB, file-rss:8kB, shmem-rss:0kB
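
To check from a saved log how much anonymous memory the OOM reaper
actually released, the "Killed process" and "oom_reaper: reaped process"
lines can be paired by pid.  The following is only a rough sketch: the
function name and regexes are hypothetical and assume the exact message
layout shown in the excerpt above, which may differ in other kernel versions.

```python
import re

# Hypothetical patterns, written against the 4.7-rc4 log lines quoted above.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)
REAPED_RE = re.compile(
    r"oom_reaper: reaped process (?P<pid>\d+) \((?P<comm>[^)]+)\), "
    r"now anon-rss:(?P<anon_rss>\d+)kB"
)


def reaped_anon_kb(log_lines):
    """Map pid -> (comm, anon-rss kB when killed, anon-rss kB after reaping)."""
    killed = {}
    result = {}
    for line in log_lines:
        m = KILLED_RE.search(line)
        if m:
            killed[int(m["pid"])] = (m["comm"], int(m["anon_rss"]))
            continue
        m = REAPED_RE.search(line)
        if m and int(m["pid"]) in killed:
            comm, before = killed[int(m["pid"])]
            result[int(m["pid"])] = (comm, before, int(m["anon_rss"]))
    return result


# The two relevant lines from the log excerpt above.
SAMPLE = [
    "[ 2084.696593] Killed process 10431 (java) total-vm:5200964kB, "
    "anon-rss:2521764kB, file-rss:0kB, shmem-rss:0kB",
    "[ 2084.938058] oom_reaper: reaped process 10431 (java), "
    "now anon-rss:0kB, file-rss:8kB, shmem-rss:0kB",
]

if __name__ == "__main__":
    for pid, (comm, before, after) in reaped_anon_kb(SAMPLE).items():
        print(f"{comm}[{pid}]: anon-rss {before} kB -> {after} kB")
```

A reaped line without a matching kill line is ignored, so a truncated log
yields fewer entries rather than wrong ones.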

Next I tried 4.6.2 with 78ebc2f7146156f4, then with kmallocwd (needed one manual fixup),
then with both patches.  It still livelocked in all cases.  The log spew looked
a bit different with 78ebc2f7146156f4 applied but still continued
endlessly.  kmallocwd alone didn't trigger; with both patches
applied kmallocwd triggered, but printk dropped most of the messages:

[  363.815595] MemAlloc-Info: stalling=33 dying=0 exiting=42 victim=0 oom_count=0
[  363.815601] MemAlloc: kworker/0:0(4) flags=0x4208860 switches=212 seq=1 gfp=0x26012c0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_NOTRACK) order=0 delay=17984
** 1402 printk messages dropped ** [  363.818816]  [<ffffffff8116d519>] __do_page_cache_readahead+0x144/0x29d
** 501 printk messages dropped **
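
To get a quick overview of such a capture, one could pull out the
MemAlloc-Info counters and total up how many printk messages were
dropped.  This is a minimal sketch that assumes the line formats seen in
the excerpt above; the function name is hypothetical, and it reads the
counters as plain key=value integers without interpreting what they mean.

```python
import re

# Matches the "** N printk messages dropped **" markers seen in the log above.
DROPPED_RE = re.compile(r"\*\* (\d+) printk messages dropped \*\*")


def summarize(log_text):
    """Return (first MemAlloc-Info counters as a dict, total dropped messages)."""
    counters = {}
    for line in log_text.splitlines():
        _, sep, rest = line.partition("MemAlloc-Info: ")
        if sep:
            # The rest of the line is assumed to be space-separated key=value pairs.
            counters = {k: int(v) for k, v in (kv.split("=") for kv in rest.split())}
            break
    dropped = sum(int(n) for n in DROPPED_RE.findall(log_text))
    return counters, dropped


# Lines copied from the excerpt above.
SAMPLE = """\
[  363.815595] MemAlloc-Info: stalling=33 dying=0 exiting=42 victim=0 oom_count=0
** 1402 printk messages dropped ** [  363.818816]  [<ffffffff8116d519>] __do_page_cache_readahead+0x144/0x29d
** 501 printk messages dropped **
"""

if __name__ == "__main__":
    counters, dropped = summarize(SAMPLE)
    print(counters)
    print("dropped printk messages:", dropped)
```

Only the first MemAlloc-Info line is used; a longer log would need one
summary per report.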

I'll zip up the logs and send them off-list.


Thanks,
Johannes

