From: Glauber Costa <glommer@parallels.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Suleiman Souhlal <ssouhlal@FreeBSD.org>,
cgroups@vger.kernel.org, suleiman@google.com, penberg@kernel.org,
cl@linux.com, yinghan@google.com, hughd@google.com,
gthelen@google.com, peterz@infradead.org,
dan.magenheimer@oracle.com, hannes@cmpxchg.org, mgorman@suse.de,
James.Bottomley@HansenPartnership.com, linux-mm@kvack.org,
devel@openvz.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 02/13] memcg: Kernel memory accounting infrastructure.
Date: Tue, 13 Mar 2012 14:37:30 +0400
Message-ID: <4F5F236A.1070609@parallels.com>
In-Reply-To: <20120313152446.28b0d696.kamezawa.hiroyu@jp.fujitsu.com>
> After looking codes, I think we need to think
> whether independent_kmem_limit is good or not....
>
> How about adding MEMCG_KMEM_ACCOUNT flag instead of this and use only
> memcg->res/memcg->memsw rather than adding a new counter, memcg->kmem ?
>
> if MEMCG_KMEM_ACCOUNT is set -> slab is accounted to mem->res/memsw.
> if MEMCG_KMEM_ACCOUNT is not set -> slab is never accounted.
>
> (I think On/Off switch is required..)
>
> Thanks,
> -Kame
>
This has been discussed before, I can probably find it in the archives
if you want to go back and see it.
But in a nutshell:
1) Supposing the independent knob disappears (I will explain in item 2
why I don't want it to), I don't think a flag makes sense either. *If*
we are planning to enable/disable this, it might make more sense to put
some work into it, and allow particular slabs to be enabled/disabled by
writing to memory.kmem.slabinfo (-* would disable all, +* enable all,
+kmalloc* enable all kmalloc, etc).
Alternatively, what we could do instead is something similar to what
ended up being done for tcp, by request of the network people: if you
never touch the limit file, don't bother with it at all, and simply do
not account. With Suleiman's lazy allocation infrastructure, that should
actually be trivial. And then again, a flag is not necessary, because
writing to the limit file does the job, and also conveys the meaning
well enough.
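
To make the memory.kmem.slabinfo idea in item 1 concrete, here is a
rough userspace sketch of how rules such as "-*", "+*" and "+kmalloc*"
could be applied to a set of caches. It is only an illustration: struct
slab_acct and apply_slab_rule() are hypothetical names, not anything in
the patch series, and POSIX fnmatch(3) stands in for whatever matching
the kernel side would actually use.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cache accounting state; not a kernel structure. */
struct slab_acct {
	const char *name;	/* cache name, e.g. "kmalloc-64" */
	bool accounted;		/* is this cache charged to the memcg? */
};

/*
 * Apply one rule such as "+kmalloc*" or "-*" to every cache whose name
 * matches the glob after the sign. Returns the number of caches
 * touched, or -1 if the rule does not start with '+' or '-'.
 */
static int apply_slab_rule(struct slab_acct *caches, size_t n,
			   const char *rule)
{
	bool enable;
	int touched = 0;
	size_t i;

	assert(caches != NULL && rule != NULL);

	if (rule[0] != '+' && rule[0] != '-')
		return -1;
	enable = (rule[0] == '+');

	for (i = 0; i < n; i++) {
		/* fnmatch() returns 0 when the name matches the pattern. */
		if (fnmatch(rule + 1, caches[i].name, 0) == 0) {
			caches[i].accounted = enable;
			touched++;
		}
	}
	return touched;
}
```

With caches named "kmalloc-64", "kmalloc-128" and "dentry", the rule
"+kmalloc*" would enable the first two and leave "dentry" untouched.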
2) For the kernel itself, we are mostly concerned that a malicious
container may pin large amounts of kernel memory, which is ultimately
unreclaimable. In particular, in scenarios where overcommit is allowed,
you can fill the whole physical memory (or at least a significant part)
with those objects, well beyond your soft limit allowance, making the
creation of further containers impossible.
With user memory, you can reclaim the cgroup back to its place. With
kernel memory, you can't.
In the particular example of 32-bit boxes, you can easily fill up a
large part of the available 1 GB kernel memory with pinned memory and
render the whole system unresponsive.
Never allowing the kernel memory to go beyond the soft limit was one of
the proposed alternatives. However, it may force you to establish a soft
limit where one was not previously needed. Or, establish a low soft
limit when you really need a bigger one.
All that said, while reading your message and thinking a bit, the
following crossed my mind:
- We can account the slabs to memcg->res normally, and just store the
information that this is kernel memory into a percpu counter, as
I proposed recently.
- The knob goes away, and becomes implicit: if you ever write anything
to memory.kmem.limit_in_bytes, we transfer that memory to a separate
kmem res_counter, and proceed from there. We can keep accounting to
memcg->res anyway, just that kernel memory will now have a separate
limit.
- With this scheme, it may not be necessary to ever have a file
memory.kmem.soft_limit_in_bytes. Reclaim is always part of the normal
memcg reclaim.
The approach outlined above would work for us and, I believe, make the
whole scheme simpler.
What do you think?