linux-mm.kvack.org archive mirror
From: Vladimir Davydov <vdavydov@parallels.com>
To: David Rientjes <rientjes@google.com>
Cc: akpm@linux-foundation.org, mhocko@suse.cz, penberg@kernel.org,
	cl@linux.com, glommer@gmail.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, devel@openvz.org
Subject: Re: [PATCH 1/8] memcg: export kmemcg cache id via cgroup fs
Date: Mon, 3 Feb 2014 17:00:10 +0400
Message-ID: <52EF92DA.1060607@parallels.com>
In-Reply-To: <alpine.DEB.2.02.1402030250110.31061@chino.kir.corp.google.com>

On 02/03/2014 03:04 PM, David Rientjes wrote:
> On Mon, 3 Feb 2014, Vladimir Davydov wrote:
>
>> AFAIU, cgroup identifiers dumped on oom (cgroup paths, currently) and
>> memcg slab cache names serve different purposes.
> Sure, you may dump the name for a number of legitimate reasons, but the 
> problem still exists that it's difficult to determine what memcg is being 
> referenced without a flat hierarchy and unique memcg names for all 
> children.
>
>> The point is oom is
>> a perfectly normal situation for the kernel, and info dumped to dmesg is
>> for admin to find out the cause of the problem (a greedy user or
>> cgroup).
> Hmm, so if we hand out top-level memcgs to individual jobs or users, like 
> our userspace does, and they are able to configure their child memcgs as 
> they wish, and then they or the admin finds in the kernel log that a 
> memory hog was killed from the memcg with the perfectly anonymous memcg 
> name of "memcg", how do we determine what job or user triggered that kill?  
> User id is not going to be conclusive in a production environment with 
> shared user accounts.
>
>> On the other hand, slab cache names are dumped to dmesg only in
>> extraordinary situations - like bugs in slab implementation, or double
>> free, or detected memory leaks - where we usually do not need the name
>> of the memcg that triggered the problem, because the bug is likely to be
>> in the kernel subsys using the cache.
> There's certainly overlap here since slab leaks triggered by a particular 
> workload, perhaps by usage of a particular syscall, can occur and cause 
> oom killing but the problem remains that neither the memcg name nor the 
> slab cache name may be conclusive to determine what job or user triggered 
> the issue.  That's why we make strict demands that memcg names are always 
> unique and encode several key values to identify the user and job and we 
> don't rely on the parent.
>
> I can also see the huge maintenance burden it would be to keep around a 
> mapping of kmem ids to {user, job} pairs just in case we later identify a 
> problem and in 99% of the cases would be just wasted storage.
>
>> Plus, the names are exported to
>> sysfs in case of slub, again for debugging purposes, AFAIK. So IMO the
>> use cases for oom vs slab names are completely different - information
>> vs debugging - and I want to export kmem.id only for the ability of
>> debugging kmemcg and slab subsystems.
>>
> Eeek, I'm not sure I agree.  I've often found that reproducing rare slab 
> issues is very difficult without knowledge of the workload so that I can 
> reproduce it.  Whereas X is a very large number of machines and we see 
> this issue on 0.0001% of X machines, I would be required to enable this 
> "debugging" aid unconditionally to ever be able to map the stored kmem id 
> back to a user and job, that mapping would be extremely costly to 
> maintain, and we've gained nothing if we had already demanded that 
> userspace identify their memcg names with unique identifiers regardless of 
> where they are in the hierarchy.

I see your point, and it sounds quite reasonable to me. So I guess I'll
drop the patch removing the cgroup name part from slab cache names
(patch 2) and resend.

Thanks.



Thread overview: 33+ messages
2014-02-02 16:33 [PATCH 0/8] memcg-vs-slab related fixes, improvements, cleanups Vladimir Davydov
2014-02-02 16:33 ` [PATCH 1/8] memcg: export kmemcg cache id via cgroup fs Vladimir Davydov
2014-02-03  6:21   ` David Rientjes
2014-02-03  6:57     ` Vladimir Davydov
2014-02-03  7:19       ` Vladimir Davydov
2014-02-03 10:05       ` Glauber Costa
2014-02-03 13:01         ` Vladimir Davydov
2014-02-03 11:04       ` David Rientjes
2014-02-03 13:00         ` Vladimir Davydov [this message]
2014-02-04 14:44       ` Michal Hocko
2014-02-04 14:40   ` Michal Hocko
2014-02-04 14:49     ` Vladimir Davydov
2014-02-02 16:33 ` [PATCH 2/8] memcg, slab: remove cgroup name from memcg cache names Vladimir Davydov
2014-02-04 14:45   ` Michal Hocko
2014-02-04 15:11     ` Vladimir Davydov
2014-02-04 15:13       ` Michal Hocko
2014-02-02 16:33 ` [PATCH 3/8] memcg, slab: never try to merge memcg caches Vladimir Davydov
2014-02-04 14:52   ` Michal Hocko
2014-02-04 14:59     ` Vladimir Davydov
2014-02-04 15:11       ` Michal Hocko
2014-02-04 15:27         ` Vladimir Davydov
2014-02-04 15:43           ` Glauber Costa
2014-02-04 16:04             ` Vladimir Davydov
2014-02-04 16:10               ` Glauber Costa
2014-02-06 14:07           ` Michal Hocko
2014-02-06 14:15             ` Vladimir Davydov
2014-02-06 15:29               ` Michal Hocko
2014-02-06 15:39                 ` Vladimir Davydov
2014-02-02 16:33 ` [PATCH 4/8] memcg, slab: separate memcg vs root cache creation paths Vladimir Davydov
2014-02-02 16:33 ` [PATCH 5/8] slub: adjust memcg caches when creating cache alias Vladimir Davydov
2014-02-02 16:33 ` [PATCH 6/8] slub: rework sysfs layout for memcg caches Vladimir Davydov
2014-02-02 16:33 ` [PATCH 7/8] memcg, slab: unregister cache from memcg before starting to destroy it Vladimir Davydov
2014-02-02 16:33 ` [PATCH 8/8] memcg, slab: do not destroy children caches if parent has aliases Vladimir Davydov
