From: Petr Mladek <pmladek@suse.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: mhocko@suse.com, linux-mm@kvack.org, sergey.senozhatsky@gmail.com
Subject: Re: [PATCH] mm/page_alloc: Wait for oom_lock before retrying.
Date: Wed, 14 Dec 2016 10:37:06 +0100
Message-ID: <20161214093706.GA16064@pathway.suse.cz>
In-Reply-To: <201612132106.IJH12421.LJStOQMVHFOFOF@I-love.SAKURA.ne.jp>
On Tue 2016-12-13 21:06:57, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Mon 12-12-16 13:55:35, Michal Hocko wrote:
> > > On Mon 12-12-16 21:12:06, Tetsuo Handa wrote:
> > > > Michal Hocko wrote:
> > [...]
> > > > > > I think this warn_alloc() is too much noise. When something went
> > > > > > wrong, multiple instances of Thread-2 tend to call warn_alloc()
> > > > > > concurrently. We don't need to report similar memory information.
> > > > >
> > > > > That is why we have ratelimiting. If it needs better tuning then
> > > > > let's just do it.
> > > >
> > > > I think that calling show_mem() once per a series of warn_alloc() threads is
> > > > sufficient. Since the amount of output by dump_stack() and that by show_mem()
> > > > are nearly equal, we can save nearly 50% of output if we manage to avoid
> > > > the same show_mem() calls.
> > >
> > > I do not mind such an update. Again, that is what we have the
> > > ratelimiting for. The fact that it doesn't throttle properly means that
> > > we should tune its parameters.
> >
> > What about the following? Does this help?
>
> I don't think it made much difference.
>
> I noticed that one of the triggers that cause a lot of
> "** XXX printk messages dropped **" is show_all_locks() added by
> commit b2d4c2edb2e4f89a ("locking/hung_task: Show all locks"). When there are
> a lot of threads being blocked on fs locks, show_all_locks() on each blocked
> thread generates an incredible amount of messages periodically. Therefore,
> I temporarily set /proc/sys/kernel/hung_task_timeout_secs to 0 to disable
> hung task warnings for testing this patch.
>
> http://I-love.SAKURA.ne.jp/tmp/serial-20161213.txt.xz is a console log with
> this patch applied. With hung task warnings disabled, the amount of messages
> is significantly reduced.
>
> Uptime > 400 are testcases where the stresser was invoked via "taskset -c 0".
> Since there are some "** XXX printk messages dropped **" messages, I can't
> tell whether the OOM killer was able to make forward progress. But guessing
> from the result that there is no corresponding "Killed process" line for
> "Out of memory: " line at uptime = 450 and the duration of PID 14622 stalled,
> I think it is OK to say that the system got stuck because the OOM killer was
> not able to make forward progress.

I am afraid that as long as you see "** XXX printk messages dropped **",
something is able to keep warn_alloc() busy, so that it never leaves
printk()/console_unlock() and blocks OOM killer progress.

> ----------
> [ 450.767693] Out of memory: Kill process 14642 (a.out) score 999 or sacrifice child
> [ 450.769974] Killed process 14642 (a.out) total-vm:4168kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB
> [ 450.776538] oom_reaper: reaped process 14642 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> [ 450.781170] Out of memory: Kill process 14643 (a.out) score 999 or sacrifice child
> [ 450.783469] Killed process 14643 (a.out) total-vm:4168kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB
> [ 450.787912] oom_reaper: reaped process 14643 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> [ 450.792630] Out of memory: Kill process 14644 (a.out) score 999 or sacrifice child
> [ 450.964031] a.out: page allocation stalls for 10014ms, order:0, mode:0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO)
> [ 450.964033] CPU: 0 PID: 14622 Comm: a.out Tainted: G W 4.9.0+ #99
> (...snipped...)
> [ 740.984902] a.out: page allocation stalls for 300003ms, order:0, mode:0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO)
> [ 740.984905] CPU: 0 PID: 14622 Comm: a.out Tainted: G W 4.9.0+ #99
> ----------
>
> Although it is fine to make warn_alloc() less verbose, this is not
> a problem which can be avoided by simply reducing printk(). Unless
> we give enough CPU time to the OOM killer and OOM victims, it is
> trivial to lock up the system.

You could try using printk_deferred() in warn_alloc(). It does not
handle the console in the caller's context, so it would help confirm
whether the blocked printk() is the main problem.
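
To make the difference concrete, here is a toy user-space model in
Python. It is not kernel code: the LogModel class and its methods are
hypothetical and only borrow the names printk() and printk_deferred().
It assumes a single console and ignores locking, IRQ work and timing.
It only shows who ends up paying for the console output.

```python
from collections import deque


class LogModel:
    """Toy model of a log buffer feeding one slow console (an assumption)."""

    def __init__(self):
        self.pending = deque()  # records stored but not yet on the console
        self.console = []       # records already written to the console

    def flush_console(self):
        """Write every pending record to the console; return how many."""
        written = 0
        while self.pending:
            self.console.append(self.pending.popleft())
            written += 1
        return written

    def printk(self, msg):
        """Store the record, then flush the console in the caller's context."""
        self.pending.append(msg)
        return self.flush_console()

    def printk_deferred(self, msg):
        """Store the record only; the caller does no console work."""
        self.pending.append(msg)
        return 0


log = LogModel()
log.printk_deferred("stall report A")
log.printk_deferred("stall report B")
# This caller pays for the whole backlog, not just for its own record.
cost = log.printk("stall report C")
print(cost, len(log.console))  # prints "3 3"
```

In this model a printk() caller has to drain everything that others
queued before it can return, while a printk_deferred() caller returns
right after storing its record. If the stalls go away with the deferred
variant, that would point at the console flushing rather than at the
allocator.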
Best Regards,
Petr