From: "Paul Menage" <menage@google.com>
To: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: balbir@linux.vnet.ibm.com, minoura@valinux.co.jp,
nishimura@mxp.nes.nec.co.jp, linux-mm@kvack.org,
containers@lists.osdl.org, hugh@veritas.com
Subject: Re: [RFC][PATCH] another swap controller for cgroup
Date: Wed, 14 May 2008 01:44:38 -0700
Message-ID: <6599ad830805140144k583f7426k4024dd17a6cd3eb8@mail.gmail.com>
In-Reply-To: <20080514032125.46F7D5A07@siro.lan>

On Tue, May 13, 2008 at 8:21 PM, YAMAMOTO Takashi
<yamamoto@valinux.co.jp> wrote:
> >
> > Could you please mention what the limitations are? We could get those fixed or
> > take another serious look at the mm->owner patches.
>
> for example, its callback can't sleep.
>
You need to be able to sleep in order to take mmap_sem, right?
Since I think that the other current user of the mm->owner callback
probably also needs mmap_sem, it might make sense to take mmap_sem in
mm_update_next_owner() prior to locking the old owner, and hold it
across the callback, which would presumably solve the problem.
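As a rough illustration of the ordering suggested above, here is a
userspace toy model, not kernel code. It assumes the callback runs
while a non-sleeping task lock is held, and that taking mmap_sem in
that state is illegal. All names (struct ctx, down_mmap_sem,
owner_callback, update_owner_current, update_owner_suggested) are
hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock bookkeeping; hypothetical, not the kernel's data structures. */
struct ctx {
    bool mmap_sem_held;  /* models mmap_sem, a lock that may sleep */
    int  atomic_depth;   /* > 0 while a non-sleeping lock is held */
};

/* A sleeping lock may only be acquired outside atomic context. */
static bool down_mmap_sem(struct ctx *c)
{
    if (c->atomic_depth > 0)
        return false;            /* would be "sleeping while atomic" */
    c->mmap_sem_held = true;
    return true;
}

static void up_mmap_sem(struct ctx *c) { c->mmap_sem_held = false; }
static void lock_task(struct ctx *c)   { c->atomic_depth++; }
static void unlock_task(struct ctx *c) { c->atomic_depth--; }

/* A callback that needs mmap_sem: reuse it if held, else try to take it. */
static bool owner_callback(struct ctx *c)
{
    if (c->mmap_sem_held)
        return true;
    if (!down_mmap_sem(c))
        return false;
    up_mmap_sem(c);
    return true;
}

/* Assumed current shape: callback invoked under the task lock only. */
static bool update_owner_current(struct ctx *c)
{
    lock_task(c);
    bool ok = owner_callback(c);
    unlock_task(c);
    return ok;
}

/* Suggested shape: take mmap_sem first and hold it across the callback. */
static bool update_owner_suggested(struct ctx *c)
{
    if (!down_mmap_sem(c))
        return false;
    lock_task(c);
    bool ok = owner_callback(c);
    unlock_task(c);
    up_mmap_sem(c);
    return ok;
}
```

In this model update_owner_current() fails because the callback cannot
acquire the sleeping lock, while update_owner_suggested() succeeds
because the lock is already held when the callback runs.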
> >
> > Isn't it bad to force a group to go over its limit due to migration?
>
> we don't have many choices as far as ->attach can't fail.
> although we can have racy checks in ->can_attach, i'm not happy with it.
can_attach() isn't racy iff you ensure that a successful result from
can_attach() can't be invalidated by any code not holding
cgroup_mutex.
The existing user of can_attach() is cpusets, and the only way to make
an attachable cpuset non-attachable is to remove its last node or cpu.
The only code paths that can do this (update_nodemask, update_cpumask,
and common_cpu_mem_hotplug_unplug) all call cgroup_lock(), which
ensures that this synchronization occurs.
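The invariant described above can be sketched as a small userspace
model, not kernel code. It assumes a single global lock flag standing
in for cgroup_mutex, and every name ending in _model is hypothetical.
The point is only that the check and every mutator run under the same
lock, so a successful check cannot be invalidated before the attach.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for cgroup_mutex; asserts catch misuse. */
static bool cgroup_mutex_locked;

static void cgroup_lock_model(void)
{
    assert(!cgroup_mutex_locked);
    cgroup_mutex_locked = true;
}

static void cgroup_unlock_model(void)
{
    assert(cgroup_mutex_locked);
    cgroup_mutex_locked = false;
}

struct cpuset_model {
    int nr_cpus;
    int nr_nodes;
    int nr_tasks;
};

/* Attachable only while the cpuset has at least one cpu and one node. */
static bool can_attach_model(const struct cpuset_model *cs)
{
    assert(cgroup_mutex_locked);  /* the check is only valid under the lock */
    return cs->nr_cpus > 0 && cs->nr_nodes > 0;
}

/* Check and attach happen in one critical section. */
static bool attach_task_model(struct cpuset_model *cs)
{
    bool ok;

    cgroup_lock_model();
    ok = can_attach_model(cs);
    if (ok)
        cs->nr_tasks++;           /* attach itself cannot fail */
    cgroup_unlock_model();
    return ok;
}

/* The mutator takes the same lock, so it cannot race with the check. */
static void update_cpumask_model(struct cpuset_model *cs, int nr_cpus)
{
    cgroup_lock_model();
    cs->nr_cpus = nr_cpus;
    cgroup_unlock_model();
}
```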
Of course, having lots of datapath operations also take cgroup_mutex
would be really painful, so it's not practical to use for things that
can become non-attachable due to a process consuming some resources.
This is part of the reason that I started working on the lock-mode
patches that I sent out yesterday, in order to make finer-grained
locking simpler. I'm going to rework those to make the locking more
explicit, and I'll bear this use case in mind while I'm doing it.
A few comments on the patch:
- you're not really limiting swap usage, you're limiting swapped-out
address space. So it looks as though if a process has swapped out most
of its address space, and forks a child, the total "swap" charge for
the cgroup will double. Is that correct? If so, why is this better
than charging for actual swap usage?
- what will happen if someone creates non-NPTL threads, which share an
mm but not a thread group (so each of them is a thread group leader)?
- if you were to store a pointer in the page rather than the
swap_cgroup pointer, then (in combination with mm->owner) you wouldn't
need to do the rebinding to the new swap_cgroup when a process moves
to a different cgroup - you could instead keep a "swapped pte" count
in the mm, and just charge that to the new cgroup and uncharge it from
the old cgroup. You also wouldn't need to keep ref counts on the
swap_cgroup.
- ideally this wouldn't actually start charging until it was bound to
a cgroup hierarchy, although I guess that the performance of this
is less important than for something like the virtual address space
controller, since once we start swapping we can expect performance to
be bad anyway.
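Two of the comments above (the charge doubling on fork, and keeping a
"swapped pte" count in the mm) can be illustrated with a toy userspace
model. This is a sketch under assumptions, not the patch's code: it
assumes the charge is one unit per swapped-out pte and that a forked
child inherits the parent's swapped-out ptes. All struct and function
names here are hypothetical.

```c
#include <assert.h>

/* Hypothetical per-cgroup counter of charged swapped-out ptes. */
struct swap_cgroup_model {
    long usage;
};

/* Hypothetical per-mm state: a swapped pte count plus its current cgroup. */
struct mm_swap_model {
    long swapped_ptes;
    struct swap_cgroup_model *cg;
};

/* Charge one swapped-out pte to the mm and to its cgroup. */
static void charge_swapped_pte(struct mm_swap_model *mm)
{
    mm->swapped_ptes++;
    mm->cg->usage++;
}

/*
 * Fork under the assumption above: the child gets its own copy of the
 * swapped-out address space, so the cgroup is charged a second time
 * even though no additional swap space is consumed.
 */
static void fork_mm_model(const struct mm_swap_model *parent,
                          struct mm_swap_model *child)
{
    child->cg = parent->cg;
    child->swapped_ptes = parent->swapped_ptes;
    child->cg->usage += child->swapped_ptes;
}

/*
 * Migration using the per-mm count: uncharge the old cgroup and charge
 * the new one in bulk, with no per-page rebinding.
 */
static void move_mm_model(struct mm_swap_model *mm,
                          struct swap_cgroup_model *to)
{
    mm->cg->usage -= mm->swapped_ptes;
    to->usage += mm->swapped_ptes;
    mm->cg = to;
}
```

With ten swapped-out ptes in the parent, a fork takes the cgroup's
usage from 10 to 20 in this model, and moving the child to another
cgroup transfers exactly its 10 units.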
Paul