From: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>,
	Michal Hocko <mhocko@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/7] mm: memcontrol: charge swap to cgroup2
Date: Wed, 16 Dec 2015 12:18:30 +0900
Message-ID: <5670D806.60408@jp.fujitsu.com>
In-Reply-To: <20151215145011.GA20355@cmpxchg.org>

On 2015/12/15 23:50, Johannes Weiner wrote:
> On Tue, Dec 15, 2015 at 12:22:41PM +0900, Kamezawa Hiroyuki wrote:
>> On 2015/12/15 4:42, Vladimir Davydov wrote:
>>> Anyway, if you don't trust a container you'd better set the hard memory
>>> limit so that it can't hurt others no matter what it runs and how it
>>> tweaks its sub-tree knobs.
>>
>> Limiting swap can easily cause "OOM killer runs even while swap is available"
>> through a simple mistake. Can't you add a "swap excess" sysctl switch so that
>> global memory reclaim can ignore the swap limit?
>
> That never worked with a combined memory+swap limit, either. How could
> it? The parent might swap you out under pressure, but simply touching
> a few of your anon pages causes them to get swapped back in, thrashing
> with whatever the parent was trying to do. Your ability to swap it out
> is simply no protection against a group touching its pages.
>
> Allowing the parent to exceed swap with separate counters makes even
> less sense, because every page swapped out frees up a page of memory
> that the child can reuse. For every swap page that exceeds the limit,
> the child gets a free memory page! The child doesn't even have to
> cause swapin, it can just steal whatever the parent tried to free up,
> and meanwhile its combined memory & swap footprint explodes.
>
Sure.
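
The quoted argument can be made concrete with a toy model. This is only a sketch under simplifying assumptions: pages are abstract units, the child refills every page that reclaim frees, and the function name `footprint_after_reclaim` and its parameters are hypothetical, not anything from the patch series or the kernel.

```python
def footprint_after_reclaim(mem_limit, swap_limit, rounds, enforce_swap_limit):
    """Toy model of a child's combined memory + swap footprint (in pages).

    Not kernel behavior: each round the parent tries to swap out one of the
    child's pages, and the child immediately reuses the freed memory page.
    """
    mem = mem_limit  # assume the child starts with its memory fully used
    swap = 0
    for _ in range(rounds):
        if enforce_swap_limit and swap >= swap_limit:
            continue  # swap limit hit: this page cannot be swapped out
        swap += 1  # one page goes to swap ...
        mem -= 1   # ... which frees one page of memory ...
        mem += 1   # ... which the child allocates again right away
    return mem + swap


if __name__ == "__main__":
    print(footprint_after_reclaim(100, 10, 50, enforce_swap_limit=True))
    print(footprint_after_reclaim(100, 10, 50, enforce_swap_limit=False))
```

In this model the footprint stays bounded by `mem_limit + swap_limit` when the limit is enforced, and grows with every reclaim round when the parent is allowed to exceed it.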

> The answer is and always should have been: don't overcommit untrusted
> cgroups. Think of swap as a resource you distribute, not as breathing
> room for the parents to rely on. Because it can't and could never.
>
OK, don't overcommit.

> And the new separate swap counter makes this explicit.
>
Hmm, my requests are:
  - require the same capabilities as mlock() for setting swap.limit=0
  - swap-full notification via vmpressure or some similar mechanism
  - the OOM killer's available-memory calculation may be thrown off; please check
  - force swap-in when swap.limit is reduced

Thanks,
-Kame
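
As an illustration of the swap-full request above, the condition can at least be polled from userspace by reading the `memory.swap.current` and `memory.swap.max` files that this patch series proposes for cgroup2. This is a minimal sketch, not the in-kernel notification being asked for: the helper names and the cgroup directory passed in are assumptions, and the code relies only on `memory.swap.max` holding either a byte count or the string `max`.

```python
from pathlib import Path


def parse_limit(text):
    """Return None for 'max' (no limit), otherwise the limit in bytes."""
    text = text.strip()
    return None if text == "max" else int(text)


def swap_is_full(current, limit):
    """True when swap usage has reached a configured limit."""
    if limit is None:
        return False  # unlimited: the cgroup limit is never the bottleneck
    return current >= limit


def read_swap_state(cgroup_dir):
    """Read (current, limit) for one cgroup; cgroup_dir is a hypothetical path."""
    base = Path(cgroup_dir)
    current = int((base / "memory.swap.current").read_text())
    limit = parse_limit((base / "memory.swap.max").read_text())
    return current, limit
```

A caller would pass the cgroup's directory under the cgroup2 mount point to `read_swap_state()` and feed the result to `swap_is_full()`. Polling like this is racy and coarse, which is part of why an event-style notification is being requested.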

