From: Glauber Costa <glommer@parallels.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: davem@davemloft.net, linux-kernel@vger.kernel.org,
paul@paulmenage.org, lizf@cn.fujitsu.com,
kamezawa.hiroyu@jp.fujitsu.com, ebiederm@xmission.com,
gthelen@google.com, netdev@vger.kernel.org, linux-mm@kvack.org,
kirill@shutemov.name, avagin@parallels.com, devel@openvz.org,
eric.dumazet@gmail.com, cgroups@vger.kernel.org,
Johannes Weiner <jweiner@redhat.com>
Subject: Re: [PATCH v9 1/9] Basic kernel memory functionality for the Memory Controller
Date: Fri, 16 Dec 2011 17:02:51 +0400
Message-ID: <4EEB417B.8000508@parallels.com>
In-Reply-To: <20111216123233.GF3122@tiehlicka.suse.cz>
On 12/16/2011 04:32 PM, Michal Hocko wrote:
> On Thu 15-12-11 16:29:18, Glauber Costa wrote:
>> On 12/14/2011 09:04 PM, Michal Hocko wrote:
>>> [Now with the current patch version, I hope]
>>> On Mon 12-12-11 11:47:01, Glauber Costa wrote:
> [...]
>>>> @@ -3848,10 +3862,17 @@ static inline u64 mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
>>>>      u64 val;
>>>>
>>>>      if (!mem_cgroup_is_root(memcg)) {
>>>> +            val = 0;
>>>> +#ifdef CONFIG_CGROUP_MEM_RES_CTLR_KMEM
>>>> +            if (!memcg->kmem_independent_accounting)
>>>> +                    val = res_counter_read_u64(&memcg->kmem, RES_USAGE);
>>>> +#endif
>>>>              if (!swap)
>>>> -                    return res_counter_read_u64(&memcg->res, RES_USAGE);
>>>> +                    val += res_counter_read_u64(&memcg->res, RES_USAGE);
>>>>              else
>>>> -                    return res_counter_read_u64(&memcg->memsw, RES_USAGE);
>>>> +                    val += res_counter_read_u64(&memcg->memsw, RES_USAGE);
>>>> +
>>>> +            return val;
>>>>      }
>>>
>>> So you report kmem+user but we do not consider kmem during charge so one
>>> can easily end up with usage_in_bytes over limit but no reclaim is going
>>> on. Not good, I would say.
>
> I find this a problem and one of the reasons I do not like !independent
> accounting.
>
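To make the mismatch concrete, here is a minimal userspace sketch, not kernel code. The names `toy_counter`, `toy_memcg`, `toy_charge_user`, `toy_charge_kmem` and `toy_usage` are made up for illustration, and the assumption that kmem charges are never checked against the user limit is a simplification of what the quoted hunk implies. It only shows how a usage report that adds kmem can exceed the limit while every user charge still succeeds and so never triggers reclaim.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a resource counter; not the kernel's res_counter. */
struct toy_counter {
	uint64_t usage;
	uint64_t limit;
};

/* Toy memcg: one counter for user memory, one for kernel memory. */
struct toy_memcg {
	struct toy_counter res;   /* user memory, limit checked at charge time */
	struct toy_counter kmem;  /* kernel memory, not checked in this sketch */
	bool kmem_independent_accounting;
};

/*
 * Charge user memory. Only the user counter is compared with the limit;
 * a false return is where the real code would start reclaim.
 */
static bool toy_charge_user(struct toy_memcg *m, uint64_t bytes)
{
	assert(m != NULL);
	if (m->res.usage + bytes > m->res.limit)
		return false;
	m->res.usage += bytes;
	return true;
}

/* Charge kernel memory. Assumption: no check against res.limit. */
static void toy_charge_kmem(struct toy_memcg *m, uint64_t bytes)
{
	assert(m != NULL);
	m->kmem.usage += bytes;
}

/* Mirrors the reporting logic of the quoted hunk: kmem is added unless independent. */
static uint64_t toy_usage(const struct toy_memcg *m)
{
	uint64_t val = 0;

	assert(m != NULL);
	if (!m->kmem_independent_accounting)
		val = m->kmem.usage;
	return val + m->res.usage;
}
```

With a limit of 100, charging 90 bytes of user memory and 50 bytes of kmem makes `toy_usage()` report 140, yet a further user charge of 10 still succeeds because the charge path looks only at `res.usage`.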
>>>
>>> OK, so to sum it up. The biggest problem I see is the (non)independent
>>> accounting. We simply cannot mix user+kernel limits otherwise we would
>>> see issues (like kernel resource hog would force memcg-oom and innocent
>>> members would die because their rss is much bigger).
>>> It is also not clear to me what should happen when we hit the kmem
>>> limit. I guess it will be kmem cache dependent.
>>
>> So right now, tcp is completely independent, since it is not
>> accounted to kmem.
>
> So why do we need kmem accounting when tcp (the only user at the moment)
> doesn't use it?
Well, it is a bit historical. I needed a basic placeholder for it, since
tcp is officially kmem. As time passed, I took most of the stuff out
of this patch to leave just the basics I would need for tcp.
It turns out I ended up focusing on the rest, and some of the stuff was
left here.
At one point I merged the tcp data into kmem, but then reverted that
behavior. The kmem counter stayed.
I agree that deferring the whole behavior would be better.
>> In summary, we still never do non-independent accounting. When we
>> start doing it for the other caches, we will have to add a test at
>> charge time as well.
>
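For reference, a charge-time test of the kind mentioned above might look roughly like the following userspace sketch. Everything in it is hypothetical (`toy_memcg`, `toy_charged_total`, `toy_try_charge`); it is not the memcg charge path, and it assumes a single shared limit when accounting is not independent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy memcg with one shared limit; all names here are illustrative. */
struct toy_memcg {
	uint64_t user_usage;
	uint64_t kmem_usage;
	uint64_t limit;
	bool kmem_independent_accounting;
};

/* Amount that counts against the shared limit. */
static uint64_t toy_charged_total(const struct toy_memcg *m)
{
	uint64_t total = m->user_usage;

	if (!m->kmem_independent_accounting)
		total += m->kmem_usage;
	return total;
}

/*
 * Hypothetical charge-time test: when accounting is not independent,
 * user and kmem charges are both checked against the same limit.
 */
static bool toy_try_charge(struct toy_memcg *m, uint64_t bytes, bool is_kmem)
{
	/* Independent kmem would have its own limit; it is not modelled here. */
	if (is_kmem && m->kmem_independent_accounting) {
		m->kmem_usage += bytes;
		return true;
	}

	/* Overflow-safe form of: total + bytes > limit. */
	if (bytes > m->limit || toy_charged_total(m) > m->limit - bytes)
		return false;	/* real code would try reclaim, then OOM */

	if (is_kmem)
		m->kmem_usage += bytes;
	else
		m->user_usage += bytes;
	return true;
}
```

With a limit of 100, 60 bytes of user memory and 30 bytes of kmem already charged, a user charge of 20 is refused. This is also where the memcg-oom concern comes in: the failure is caused partly by kernel memory, while the obvious OOM victims are tasks with large rss.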
> So we shouldn't do it as a part of this patchset because the further
> usage is not clear and I think there will be some real issues with
> user+kmem accounting (e.g. a proper memcg-oom implementation).
> Can you just drop this patch?
Yes, but the whole set is in the net tree already. (All the other patches
are tcp-related; only this one is not.) Would you mind if I sent a follow-up
patch removing the kmem files and leaving just the registration functions
and basic documentation? (And sorry for that in advance.)
>> We still need to keep it separate though, in case the independent
>> flag is turned on/off
>
> I don't mind having kmem.tcp.* knobs.
>