From: Kirill Korotaev <dev@sw.ru>
To: vatsa@in.ibm.com
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>,
Andrew Morton <akpm@osdl.org>,
mingo@elte.hu, nickpiggin@yahoo.com.au, sam@vilain.net,
linux-kernel@vger.kernel.org, dev@openvz.org, efault@gmx.de,
balbir@in.ibm.com, sekharan@us.ibm.com, nagar@watson.ibm.com,
haveblue@us.ibm.com, pj@sgi.com, saw@sawoct.com
Subject: Re: [RFC, PATCH 0/5] Going forward with Resource Management - A cpu controller
Date: Fri, 04 Aug 2006 20:16:53 +0400
Message-ID: <44D372F5.5000901@sw.ru>
In-Reply-To: <20060804114109.GA28988@in.ibm.com>
>>I think the risk is that OpenVZ has all the controls and resource
>>managers we need, while CKRM is still more research-ish. I find the
>>OpenVZ code much clearer, cleaner and complete at the moment, although
>>also much more conservative in its approach to solving problems.
>
>
> I think it would be nice to compare first the features provided by ckrm and
> openvz at some point and agree upon the minimum common features we need to have
> as we go forward. For instance I think Openvz assumes that tasks do
> not need to move between containers (task-groups), whereas ckrm provides this
> flexibility for workload management. This may have some effect on the
> controller/interface design, no?
BTW, to help with the comparison (as you suggested above), here is the list of features provided by OpenVZ:
Memory and some other resources related to mem
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- kernel memory: vmas, LDT, page tables, poll, select, IPC undos and many other kernel
structures which can be created on user request.
Without its accounting/limiting a system is DoS'able.
user memory (private memory, shared memory, tmpfs, swap):
- locked pages
- shmpages
- physpages. Accounting only. Correctly accounts for fractions of memory
shared between containers. Can't be limited in a user-friendly manner,
since memory denials from page faults are not handled from user space :/
- private memory pages. These are private pages which are not backed by
a file or swap and which are pure user pages: anonymous private mappings
and COW-able mappings (e.g. glibc .data) which result in private memory.
Accounted correctly, taking into account sharing between containers (i.e. a page
fraction is accounted).
This resource is limited at mmap() time.
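To illustrate the page-fraction accounting described above, here is a minimal Python sketch. It assumes that a page mapped by N containers is charged as 1/N of a page to each of them; that equal split, the function name `charge_fractions` and the container names are assumptions made for the example, not the OpenVZ implementation.

```python
from collections import defaultdict
from fractions import Fraction


def charge_fractions(page_mappers):
    """Charge each page to the containers that map it.

    page_mappers: dict of page id -> set of container names mapping the page.
    Returns a dict of container name -> pages charged (exact fractions).
    Assumption: a page shared by N containers costs each of them 1/N.
    """
    usage = defaultdict(Fraction)
    for containers in page_mappers.values():
        if not containers:
            continue  # unmapped page, nothing to charge
        share = Fraction(1, len(containers))
        for name in containers:
            usage[name] += share
    return dict(usage)


# Hypothetical example: four pages, some shared between containers.
pages = {
    0: {"ct101"},
    1: {"ct101", "ct102"},
    2: {"ct101", "ct102", "ct103"},
    3: {"ct102"},
}
usage = charge_fractions(pages)
for name in sorted(usage):
    print(name, usage[name])
```

With this scheme the per-container charges always add up to the number of pages actually in use, so shared memory is neither double-counted nor lost.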
others:
- 2-level OOM killer. The fattest container should be selected to be killed first.
We introduce a guarantee against OOM, so that if a container
consumes less memory than it is guaranteed, it won't be killed.
- memory pinned by dcache (there is a simple DoS which can be done
by any Linux user to consume the whole normal zone)
- number of iptables entries (with virtualized networking,
containers can allocate memory for iptables rules)
- other socket buffers (unix, netlinks)
- TCP rcv/snd buffers
- UDP rcv buffers
- number of TCP sockets
- number of unix/netlink/other sockets
- number of flocks
- number of ptys
- number of siginfo's
- number of files
- number of tasks
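The 2-level OOM killer from the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: "fattest" is taken to mean the largest absolute memory usage, the second level simply picks the largest task in the chosen container, and all names (`Container`, `oom_guarantee`, `select_oom_victim`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    pid: int
    rss_pages: int


@dataclass
class Container:
    name: str
    oom_guarantee: int  # pages protected from the OOM killer (assumed field)
    tasks: list = field(default_factory=list)

    @property
    def usage(self):
        return sum(t.rss_pages for t in self.tasks)


def select_oom_victim(containers):
    """Return (container, task) to kill, or None if every container is protected."""
    # Level 1: a container using less than its guarantee is never a candidate.
    candidates = [c for c in containers
                  if c.tasks and c.usage >= c.oom_guarantee]
    if not candidates:
        return None
    fattest = max(candidates, key=lambda c: c.usage)
    # Level 2 (assumption): kill the largest task inside that container.
    victim = max(fattest.tasks, key=lambda t: t.rss_pages)
    return fattest, victim


# Hypothetical example: ct101 is the biggest consumer but stays under its guarantee.
ct101 = Container("ct101", 1000, [Task(1, 300), Task(2, 500)])
ct102 = Container("ct102", 100, [Task(10, 200), Task(11, 450)])
ct103 = Container("ct103", 0, [Task(20, 400)])
container, task = select_oom_victim([ct101, ct102, ct103])
print(container.name, task.pid)
```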
CPU management
~~~~~~~~~~~~~~
1. 2-level fair CPU scheduler with known theoretical fairness and latency bounds:
- 1st level selects a container to run based on the container weight
- 2nd level selects a runqueue in the container and a task in the runqueue
2. CPU limits. A container is limited to some CPU rate even if CPUs are idle.
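A toy model of the two points above, for a single CPU. This is not the OpenVZ scheduler: the first level here uses a simple weighted virtual-time rule (in the style of stride scheduling), the second level collapses "runqueue and task" into one round-robin queue per container, and the limit is modelled as a tick budget per period. `Container`, `pick_next`, `run` and `PERIOD` are names invented for the sketch.

```python
from collections import deque
from fractions import Fraction

PERIOD = 10  # ticks per limit-accounting period (arbitrary value)


class Container:
    def __init__(self, name, weight, tasks, limit=None):
        self.name = name
        self.weight = weight        # larger weight means larger CPU share
        self.tasks = deque(tasks)   # stand-in for the container's runqueue
        self.limit = limit          # max ticks per period, None = unlimited
        self.vtime = Fraction(0)    # weighted CPU time consumed so far
        self.used = 0               # ticks consumed in the current period


def pick_next(containers):
    """Pick (container name, task) for the next tick, or None to idle."""
    # Level 1: among runnable, non-throttled containers take the one
    # that has received the least weighted CPU time.
    eligible = [c for c in containers
                if c.tasks and (c.limit is None or c.used < c.limit)]
    if not eligible:
        return None  # the CPU idles even if a throttled container has work
    c = min(eligible, key=lambda x: x.vtime)
    # Level 2: round-robin among the tasks of that container.
    task = c.tasks[0]
    c.tasks.rotate(-1)
    c.vtime += Fraction(1, c.weight)
    c.used += 1
    return c.name, task


def run(containers, ticks):
    trace = []
    for tick in range(ticks):
        if tick % PERIOD == 0:
            for c in containers:
                c.used = 0  # a new period refills every budget
        trace.append(pick_next(containers))
    return trace


# Hypothetical example: weights 3:1, no limits.
trace = run([Container("ct101", 3, ["a1", "a2"]),
             Container("ct102", 1, ["b1"])], 40)
print(trace[:8])
```

Over a long enough run the containers receive CPU time in proportion to their weights, while a container with `limit` set stops running once its budget for the period is spent, leaving the CPU idle if nothing else is runnable.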
2-level disk quota
~~~~~~~~~~~~~~~~~~
Allows limiting a directory subtree to some amount of disk space.
Inside this quota, standard Linux per-user quotas are available.
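The nesting of the two quota levels can be sketched like this: an allocation has to fit both the per-user quota inside the subtree and the quota of the subtree as a whole. The class and method names (`SubtreeQuota`, `set_user_limit`, `charge`) are hypothetical, and counting in abstract "blocks" is a simplification.

```python
class QuotaExceeded(Exception):
    pass


class SubtreeQuota:
    """First level: a limit on a whole directory subtree.
    Second level: optional per-user limits inside that subtree."""

    def __init__(self, limit_blocks):
        self.limit = limit_blocks
        self.used = 0
        self.user_limits = {}  # uid -> limit in blocks
        self.user_used = {}    # uid -> blocks in use

    def set_user_limit(self, uid, limit_blocks):
        self.user_limits[uid] = limit_blocks

    def charge(self, uid, blocks):
        """Account `blocks` to `uid`; raise QuotaExceeded if either level is full."""
        user_used = self.user_used.get(uid, 0)
        user_limit = self.user_limits.get(uid)
        # Check both levels first so that a failure leaves no partial charge.
        if user_limit is not None and user_used + blocks > user_limit:
            raise QuotaExceeded("per-user quota exceeded for uid %d" % uid)
        if self.used + blocks > self.limit:
            raise QuotaExceeded("subtree quota exceeded")
        self.used += blocks
        self.user_used[uid] = user_used + blocks


# Hypothetical example: a 100-block subtree, uid 500 limited to 30 blocks.
quota = SubtreeQuota(100)
quota.set_user_limit(500, 30)
quota.charge(500, 20)
try:
    quota.charge(500, 20)
except QuotaExceeded as err:
    print("denied:", err)
```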
Thanks,
Kirill