From: Michal Hocko <mhocko@kernel.org>
To: lsf-pc@lists.linux-foundation.org
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [LSF/MM TOPIC] proposals for topics
Date: Mon, 25 Jan 2016 14:33:57 +0100
Message-ID: <20160125133357.GC23939@dhcp22.suse.cz>
Hi,
I would like to propose the following topics (mainly for the MM track,
but some of them might be of interest to FS people as well):
- gfp flags for allocation requests seem to be quite complicated
and are used arbitrarily by many subsystems. GFP_REPEAT is one such
example. Half of the current usage is for low order allocation
requests, where the flag is basically ignored. Moreover, the
documentation claims that such a request is _not_ retrying endlessly,
which is true only for costly high order allocations. I think we
should get rid of most of the users of this flag (basically all low
order ones) and then come up with something like GFP_BEST_EFFORT
which would work for all orders consistently [1].
- GFP_NOFS is another one which would be good to discuss. Its primary
use is to prevent reclaim from recursing back into the FS. This makes
such an allocation context weaker, and historically we haven't
triggered the OOM killer for it; instead we hopelessly retry the
request and rely on somebody else to make progress for us. There are
two issues here.
First, we shouldn't retry endlessly but rather fail the allocation and
allow the FS to handle the error. As per my experiments most FS cope
with that quite reasonably. Btrfs, however, handles many of those
failures by BUG_ON, which is really unfortunate.
Second, GFP_NOFS is quite often used without any obvious reason. It
is not clear which lock is held and could be taken from the reclaim
path. Wouldn't it be much better if the no-recursion behavior was
bound to the lock scope rather than to a particular allocation
request? We already have something like this for PM
(pm_res{trict,tore}_gfp_mask) and for NOIO
(memalloc_noio_{save,restore}). It would be great if we could unify
this and use context based NOFS in the FS.
- The OOM killer has been discussed a lot over the past year. We
discussed this topic at LSF last year and there has been quite some
progress since then. We have async memory tear down for the OOM victim
[2] which should help in many corner cases. We are still waiting
to make mmap_sem for write killable, which would help in some other
classes of corner cases. Whatever we do, however, will not work in
100% of cases. So the primary question is how far we are willing to
go to support different corner cases. Do we want to have a global
panic_after_timeout knob, or allow multiple OOM victims after
a timeout?
- sysrq+f to trigger the OOM killer follows some of the heuristics used
by the OOM killer invoked by the system, which means that it is
unreliable and it might not kill any task at all, without any
explanation why. The semantics of the knob don't seem to be clear and
it has even been suggested [3] to remove it altogether as a useless
debugging aid. Is this really the general consensus?
- One of the long-standing issues related to OOM handling is when to
actually declare OOM. There are workloads which might be thrashing on
the few last remaining pagecache pages or on swap, which makes the
system completely unusable for a considerable amount of time, yet the
OOM killer is not invoked. Can we finally do something about that?
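To make the scope based NOFS idea from the GFP_NOFS item above a bit
more concrete, here is a minimal userspace model of how such a scope
could behave. This is only a sketch and not kernel code: every name in
it (model_nofs_save, model_nofs_restore, model_effective_gfp, the
MODEL_GFP_* bits) is hypothetical and merely mirrors the save/restore
shape of memalloc_noio_{save,restore}. The single global variable
stands in for what would presumably be per-task state.

```c
#include <assert.h>

/* Hypothetical flag bits for this model only, not the kernel's values. */
#define MODEL_GFP_IO     0x1u
#define MODEL_GFP_FS     0x2u
#define MODEL_GFP_KERNEL (MODEL_GFP_IO | MODEL_GFP_FS)

/* Stand-in for per-task state (assumption: one task, no concurrency). */
static unsigned int model_nofs_active;

/* Enter a NOFS scope. Returns the previous state so that scopes nest. */
unsigned int model_nofs_save(void)
{
	unsigned int old = model_nofs_active;

	model_nofs_active = 1;
	return old;
}

/* Leave a NOFS scope by restoring the state returned by the save. */
void model_nofs_restore(unsigned int old)
{
	model_nofs_active = old;
}

/*
 * What an allocator entry point could do with the caller's mask:
 * inside a NOFS scope the FS bit is dropped no matter what the
 * caller asked for.
 */
unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_nofs_active)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}

/*
 * Example caller: the scope is opened where the FS would take the
 * lock that reclaim must not recurse on, so code called in between
 * can keep passing the full mask.
 */
unsigned int model_fs_operation(void)
{
	unsigned int old = model_nofs_save();
	unsigned int used = model_effective_gfp(MODEL_GFP_KERNEL);

	model_nofs_restore(old);
	return used;
}
```

Returning the old state from the save call is what keeps nested scopes
correct: an inner restore leaves the outer scope in force. The point of
the model is that the restriction is documented by where the scope
starts, rather than by each individual allocation request.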
[1] http://lkml.kernel.org/r/1446740160-29094-1-git-send-email-mhocko@kernel.org
[2] http://lkml.kernel.org/r/1452094975-551-1-git-send-email-mhocko@kernel.org
[3] http://lkml.kernel.org/r/alpine.DEB.2.10.1601141347220.16227@chino.kir.corp.google.com
--
Michal Hocko
SUSE Labs