From: Michal Hocko <mhocko-IBi9RG/b67k@public.gmane.org>
To: Vasily Averin <vvs-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>
Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
Vladimir Davydov
<vdavydov.dev-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Andrew Morton
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Roman Gushchin <guro-b10kYP2dOMg@public.gmane.org>,
Uladzislau Rezki <urezki-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org>,
Shakeel Butt <shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Mel Gorman
<mgorman-3eNAlZScCAx27rWaFMvyedHuzzzSOjJt@public.gmane.org>,
Tetsuo Handa
<penguin-kernel-1yMVhJb1mP/7nzcFbJAaVXf5DAMn2ifp@public.gmane.org>,
cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
kernel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org
Subject: Re: [PATCH memcg 3/3] memcg: handle memcg oom failures
Date: Thu, 21 Oct 2021 18:47:29 +0200
Message-ID: <YXGZoVhROdFG2Wym@dhcp22.suse.cz>
In-Reply-To: <b618ac5c-e982-c4af-ecf3-564b8de52c8c-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>
On Thu 21-10-21 18:05:28, Vasily Averin wrote:
> On 21.10.2021 14:49, Michal Hocko wrote:
> > I do understand that handling a very specific case sounds easier but it
> > would be better to have a robust fix even if that requires some more
> > head scratching. So far we have collected several reasons why it is
> > bad to trigger oom killer from the #PF path. There is no single argument
> > to keep it so it sounds like a viable path to pursue. Maybe there are
> > some very well hidden reasons but those should be documented and this is
> > a great opportunity to do either.
> >
> > Moreover if it turns out that there is a regression then this can be
> > easily reverted and a different, maybe memcg specific, solution can be
> > implemented.
>
> Now I agree,
> however I still have a few open questions.
>
> 1) VM_FAULT_OOM may be triggered w/o execution of out_of_memory()
> for example it can be caused by incorrect vm fault operations,
> (a) which can return this error without calling allocator at all.
I would argue this to be a bug. How can that particular code tell
whether the system is OOM and the oom killer is a reasonable measure
to take?
> (b) or which can provide incorrect gfp flags and allocator can fail without execution of out_of_memory.
I am not sure I can see any sensible scenario where pagefault oom killer
would be an appropriate fix for that.
> (c) This may happen on stable/LTS kernels when an otherwise successful allocation fails by hitting the limit of the legacy memcg-kmem controller.
> We'll drop it in upstream kernels, however how to handle it in old kernels?
Triggering the global oom killer for legacy kmem charge failure is
clearly wrong. Removing oom killer from #PF would fix that problem.
> We can make sure that out_of_memory or the allocator was called by setting some per-task flags.
I am not sure I see how that would be useful other than reporting a
dubious VM_FAULT_OOM usage. I am also not sure how that would be
implemented as the allocator can be called several times not to mention that
the allocation itself could have been done from a different context -
e.g. WQ.
> Can pagefault_out_of_memory() send itself a SIGKILL in all these cases?
In principle it can as sending a signal is not prohibited. I would argue
it should not though because it is just the wrong thing to do in all those
cases.
> If not -- the task will loop.
Yes, but it will be killable from userspace. So this is not an
unrecoverable situation.
> It is much better than execution of global OOM, however it would be even better to avoid it somehow.
How?
> You said: "We cannot really kill the task if we could we would have done it by the oom killer already".
> However what should we do if we did not even try to use the oom killer? (see (b) and (c))
> or if we did not use the allocator at all (see (a))
See above
> 2) in your patch we just exit from pagefault_out_of_memory() and restart a new #PF.
> We can call schedule_timeout() and wait some time before a new #PF restart.
> Additionally we can increase this delay in each new cycle.
> It helps to save CPU time for other tasks.
> What do you think about it?
I do not have a strong opinion on this. A short sleep makes sense. I am
not sure a more complex implementation is really needed.
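
For illustration only, the two options in (2) could look roughly like the
userspace sketch below. It only computes the delay values; it does not call
schedule_timeout() or any other kernel interface. The function names and
the 1 ms / 100 ms constants are made up for this example and are not taken
from any patch in this thread.

```c
/* Hypothetical constants, chosen only for this illustration. */
#define BASE_DELAY_MS 1u
#define MAX_DELAY_MS  100u

/* Option A: the same short sleep before every retried #PF. */
static unsigned int fixed_delay_ms(unsigned int attempt)
{
	(void)attempt;	/* the retry count is deliberately ignored */
	return BASE_DELAY_MS;
}

/* Option B: double the delay on each retry, capped at MAX_DELAY_MS. */
static unsigned int growing_delay_ms(unsigned int attempt)
{
	unsigned int delay = BASE_DELAY_MS;

	/* Stop doubling once the cap is reached, so the value cannot overflow. */
	while (attempt-- > 0 && delay < MAX_DELAY_MS)
		delay *= 2;

	return delay < MAX_DELAY_MS ? delay : MAX_DELAY_MS;
}
```

Option B needs a retry counter kept somewhere across faults, which is the
extra complexity in question; option A needs no state at all.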
--
Michal Hocko
SUSE Labs