From: Michal Hocko <mhocko@kernel.org>
To: Eric Anholt <eric@anholt.net>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
Christian.Koenig@amd.com
Subject: Re: [RFC] Per file OOM badness
Date: Fri, 19 Jan 2018 09:20:46 +0100
Message-ID: <20180119082046.GL6584@dhcp22.suse.cz>
In-Reply-To: <87k1wfgcmb.fsf@anholt.net>
On Thu 18-01-18 12:01:32, Eric Anholt wrote:
> Michal Hocko <mhocko@kernel.org> writes:
>
> > On Thu 18-01-18 18:00:06, Michal Hocko wrote:
> >> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
> >> > Hi, this series is a revised version of an RFC sent by Christian Konig
> >> > a few years ago. The original RFC can be found at
> >> > https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
> >> >
> >> > This is the same idea and I've just addressed his concern from the original RFC
> >> > and switched to a callback into file_ops instead of a new member in struct file.
> >>
> >> Please add the full description to the cover letter and do not make
> >> people hunt links.
> >>
> >> Here is the original cover letter text
> >> : I'm currently working on the issue that when device drivers allocate memory on
> >> : behalf of an application the OOM killer usually doesn't know about that unless
> >> : the application also gets this memory mapped into its address space.
> >> :
> >> : This is especially annoying for graphics drivers where a lot of the VRAM
> >> : usually isn't CPU accessible and so doesn't make sense to map into the
> >> : address space of the process using it.
> >> :
> >> : The problem now is that when an application starts to use a lot of VRAM those
> >> : buffer objects sooner or later get swapped out to system memory, but when we
> >> : now run into an out of memory situation the OOM killer obviously doesn't know
> >> : anything about that memory and so usually kills the wrong process.
> >
> > OK, but how do you attribute that memory to a particular OOM killable
> > entity? And how do you actually enforce that those resources get freed
> > on the oom killer action?
> >
> >> : The following set of patches tries to address this problem by introducing a per
> >> : file OOM badness score, which device drivers can use to give the OOM killer a
> >> : hint how many resources are bound to a file descriptor so that it can make
> >> : better decisions which process to kill.
> >
> > But files are not killable, they can be shared... In other words this
> > doesn't help the oom killer to make an educated guess at all.
>
> Maybe some more context would help the discussion?
>
> The struct file in patch 3 is the DRM fd. That's effectively "my
> process's interface to talking to the GPU" not "a single GPU resource".
> Once that file is closed, all of the process's private, idle GPU buffers
> will be immediately freed (this will be most of their allocations), and
> some will be freed once the GPU completes some work (this will be most
> of the rest of their allocations).
>
> Some GEM BOs won't be freed just by closing the fd, if they've been
> shared between processes. Those are usually about 8-24MB total in a
> process, rather than the GBs that modern apps use (or that our testcases
> like to allocate and thus trigger oomkilling of the test harness instead
> of the offending testcase...)
>
> Even if we just had the private+idle buffers being accounted in OOM
> badness, that would be a huge step forward in system reliability.
OK, in that case I would propose a different approach. We already
have rss_stat. So why don't we simply add a new counter there,
MM_KERNELPAGES, and consider it in oom_badness? The rule would be
that such memory is bound to the process lifetime. I guess we will
find more users for this later.
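
A minimal userspace sketch of that idea, not kernel code. It assumes
the badness score is roughly the sum of the per-mm counters plus the
page-table pages, and it leaves out oom_score_adj and every other
adjustment. The names task_model, account_kernel_pages() and
badness_model() are made up for illustration; MM_KERNELPAGES is the
proposed counter and does not exist today.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: these names mirror, but are not, the kernel's. */
enum mm_counter {
	MM_FILEPAGES,
	MM_ANONPAGES,
	MM_SWAPENTS,
	MM_SHMEMPAGES,
	MM_KERNELPAGES,	/* proposed: pages a driver allocated on the task's behalf */
	NR_MM_COUNTERS
};

struct task_model {
	long rss_stat[NR_MM_COUNTERS];	/* page counts, one slot per counter */
	long pgtable_pages;		/* pages used for page tables */
};

/*
 * Hypothetical driver-side hook: charge (or, with a negative value,
 * uncharge) pages whose lifetime is bound to the process.
 */
static void account_kernel_pages(struct task_model *t, long nr_pages)
{
	assert(t != NULL);
	t->rss_stat[MM_KERNELPAGES] += nr_pages;
}

/*
 * Simplified badness: every counter, including the new one, adds to
 * the score, so driver-held memory makes the task a likelier victim.
 */
static long badness_model(const struct task_model *t)
{
	long points = 0;
	int i;

	assert(t != NULL);
	for (i = 0; i < NR_MM_COUNTERS; i++)
		points += t->rss_stat[i];
	return points + t->pgtable_pages;
}
```

With this model a task that maps little memory but has many pages
charged through account_kernel_pages() outranks a task with a larger
mapped footprint, which is the behaviour the DRM case asks for.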
--
Michal Hocko
SUSE Labs