linux-fsdevel.vger.kernel.org archive mirror
From: Michal Hocko <mhocko@suse.com>
To: Mina Almasry <almasrymina@google.com>
Cc: Theodore Ts'o <tytso@mit.edu>, Greg Thelen <gthelen@google.com>,
	Shakeel Butt <shakeelb@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>, Roman Gushchin <guro@fb.com>,
	Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Muchun Song <songmuchun@bytedance.com>,
	riel@surriel.com, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, cgroups@vger.kernel.org
Subject: Re: [PATCH v3 2/4] mm/oom: handle remote ooms
Date: Mon, 15 Nov 2021 11:58:45 +0100
Message-ID: <YZI9ZbRVdRtE2m70@dhcp22.suse.cz>
In-Reply-To: <CAHS8izMjfwgiNEoJWGSub6iqgPKyyoMZK5ONrMV2=MeMJsM5sg@mail.gmail.com>

On Fri 12-11-21 09:59:22, Mina Almasry wrote:
> On Fri, Nov 12, 2021 at 12:36 AM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Fri 12-11-21 00:12:52, Mina Almasry wrote:
> > > On Thu, Nov 11, 2021 at 11:52 PM Michal Hocko <mhocko@suse.com> wrote:
> > > >
> > > > On Thu 11-11-21 15:42:01, Mina Almasry wrote:
> > > > > On remote ooms (OOMs due to remote charging), the oom-killer will attempt
> > > > > to find a task to kill in the memcg under oom, if the oom-killer
> > > > > is unable to find one, the oom-killer should simply return ENOMEM to the
> > > > > allocating process.
> > > >
> > > > This really begs for some justification.
> > > >
> > >
> > > I'm thinking (and I can add to the commit message in v4) that we have
> > > 2 reasonable options when the oom-killer gets invoked and finds
> > > nothing to kill: (1) return ENOMEM, (2) kill the allocating task. I'm
> > > thinking returning ENOMEM allows the application to gracefully handle
> > > the failure to remote charge and continue operation.
> > >
> > > For example, in the network service use case that I mentioned in the
> > > RFC proposal, it's beneficial for the network service to get an ENOMEM
> > > and continue to service network requests for other clients running on
> > > the machine, rather than get oom-killed when hitting the remote memcg
> > > limit. But, this is not a hard requirement, the network service could
> > > fork a process that does the remote charging to guard against the
> > > remote charge bringing down the entire process.
> >
> > This all belongs to the changelog so that we can discuss all potential
> > implication and do not rely on any implicit assumptions.
> 
> Understood. Maybe I'll wait to collect more feedback and upload v4
> with a thorough explanation of the thought process.
> 
> > E.g. why does
> > it even make sense to kill a task in the origin cgroup?
> >
> 
> The behavior I saw returning ENOMEM for this edge case was that the
> code was forever looping the pagefault, and I was (seemingly
> incorrectly) under the impression that a suggestion to forever loop
> the pagefault would be completely fundamentally unacceptable.

Well, I have to say I am not entirely sure what the best way to
handle this situation is. Another option would be to treat this like an
ENOSPC situation. This would result in SIGBUS IIRC.

The main problem with the OOM killer is that it will not resolve the
underlying problem in most situations. Shmem files would likely stay
lying around and their charge along with them. Killing the allocating
task has problems of its own because this could be just a DoS vector
exploitable by other unrelated tasks sharing the shmem mount point,
without a graceful fallback. Retrying the page fault is hard to detect.
SIGBUS might be something that helps with the latter. The question is
how to communicate this requirement down to the memcg code so that it
knows that memory reclaim should happen (Should it? How hard should we
try?) but that the oom killer must not be invoked. The more I think
about this the nastier it gets.
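
For illustration only, and not part of the patch series: a minimal
userspace sketch of what a graceful fallback on SIGBUS could look like
for a task that maps a file. Faulting beyond EOF is used here merely as
a stand-in trigger that reliably raises SIGBUS; whether a failed remote
charge would be reported the same way is exactly the open question
above. The names sigbus_demo(), try_touch() and on_sigbus() are
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;

/* SIGBUS handler: abandon the faulting access and unwind to try_touch(). */
static void on_sigbus(int sig)
{
	(void)sig;
	siglongjmp(fault_env, 1);
}

/* Store one byte at p. Returns 0 on success, -1 if the fault raised SIGBUS. */
static int try_touch(volatile char *p)
{
	if (sigsetjmp(fault_env, 1))
		return -1;
	*p = 1;
	return 0;
}

/*
 * Returns 1 if the first access raised SIGBUS and the same access
 * succeeded once the file had backing, 0 otherwise, -1 on setup error.
 */
int sigbus_demo(void)
{
	char path[] = "/tmp/sigbus-demo-XXXXXX";
	struct sigaction sa, old;
	long page = sysconf(_SC_PAGESIZE);
	int fd, first, second = -1;
	char *map;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigbus;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGBUS, &sa, &old) < 0) {
		close(fd);
		return -1;
	}

	/* The file is still empty, so the mapping has no backing page yet. */
	map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		sigaction(SIGBUS, &old, NULL);
		close(fd);
		return -1;
	}

	/* Stand-in trigger: a fault beyond EOF is reported as SIGBUS. */
	first = try_touch(map);

	/* Fallback path: give the mapping backing, then retry the access. */
	if (ftruncate(fd, page) == 0)
		second = try_touch(map);

	munmap(map, (size_t)page);
	sigaction(SIGBUS, &old, NULL);
	close(fd);
	return first == -1 && second == 0;
}
```

The point of the sketch is only that, unlike an OOM kill, SIGBUS leaves
the task a chance to notice the failed fault and pick a fallback rather
than being killed or looping on the fault.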
-- 
Michal Hocko
SUSE Labs
