All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kirill Korotaev <dev@sw.ru>
To: Andrew Morton <akpm@osdl.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	Pavel Emelianov <xemul@openvz.org>, Andrey Savochkin <saw@sw.ru>,
	devel@openvz.org, Rik van Riel <riel@redhat.com>,
	Andi Kleen <ak@suse.de>, Oleg Nesterov <oleg@tv-sign.ru>,
	Alexey Dobriyan <adobriyan@mail.ru>,
	Matt Helsley <matthltc@us.ibm.com>,
	CKRM-Tech <ckrm-tech@lists.sourceforge.net>,
	Hugh Dickins <hugh@veritas.com>
Subject: [PATCH] BC: resource beancounters (v4) (added user memory)
Date: Tue, 05 Sep 2006 19:02:34 +0400	[thread overview]
Message-ID: <44FD918A.7050501@sw.ru> (raw)

Core Resource Beancounters (BC) + kernel/user memory control.

BC allows to account and control consumption
of kernel resources used by group of processes.

Draft UBC description on OpenVZ wiki can be found at
http://wiki.openvz.org/UBC_parameters

The full BC patch set allows to control:
- kernel memory. All the kernel objects allocatable
on user demand should be accounted and limited
for DoS protection.
E.g. page tables, task structs, vmas etc.

- virtual memory pages. BCs allow to
limit a container to some amount of memory and
introduces 2-level OOM killer taking into account
container's consumption.
pages shared between containers are correctly
charged as fractions (tunable).

- network buffers. These includes TCP/IP rcv/snd
buffers, dgram snd buffers, unix, netlinks and
other buffers.

- minor resources accounted/limited by number:
tasks, files, flocks, ptys, siginfo, pinned dcache
mem, sockets, iptentries (for containers with
virtualized networking)

As the first step we want to propose for discussion
the most complicated parts of resource management:
kernel memory and virtual memory.
The patch set to be sent provides core for BC and
management of kernel memory only. Virtual memory
management will be sent in a couple of days.

The patches in these series are:
diff-atomic-dec-and-lock-irqsave.patch
 introduce atomic_dec_and_lock_irqsave()

diff-bc-kconfig.patch:
 Adds kernel/bc/Kconfig file with UBC options and
 includes it into arch Kconfigs

diff-bc-core.patch:
 Contains core functionality and interfaces of BC:
 find/create beancounter, initialization,
 charge/uncharge of resource, core objects' declarations.

diff-bc-task.patch:
 Contains code responsible for setting BC on task,
 it's inheriting and setting host context in interrupts.

 Task contains three beancounters:
 1. exec_bc  - current context. all resources are charged
               to this beancounter.
 2. fork_bc  - beancounter which is inherited by
               task's children on fork

diff-bc-syscalls.patch:
 Patch adds system calls for BC management:
 1. sys_get_bcid    - get current BC id
 2. sys_set_bcid    - changes exec_ and fork_ BCs on current
 3. sys_set_bclimit - set limits for resources consumtions
 4. sys_get_bcstat  - returns limits/usages/fails for BC

diff-bc-kmem-core.patch:
 Introduces BC_KMEMSIZE resource which accounts kernel
 objects allocated by task's request.

 Objects are accounted via struct page and slab objects.
 For the latter ones each slab contains a set of pointers
 corresponding object is charged to.

 Allocation charge rules:
 1. Pages - if allocation is performed with __GFP_BC flag - page
    is charged to current's exec_bc.
 2. Slabs - kmem_cache may be created with SLAB_BC flag - in this
    case each allocation is charged. Caches used by kmalloc are
    created with SLAB_BC | SLAB_BC_NOCHARGE flags. In this case
    only __GFP_BC allocations are charged.

diff-bc-kmem-charge.patch:
 Adds SLAB_BC and __GFP_BC flags in appropriate places
 to cause charging/limiting of specified resources.

diff-bc-vmlocked-core.patch:
Introduces new resource BC_LOCKEDPAGES for accounting
of mlock-ed user pages.

diff-bc-vmlocked-charge.patch:
Places calls to BC core over the kernel to charge locked memory.

diff-bc-privvm.patch:
This patch instroduces new resource - BC_PRIVVMPAGES.
Privvmpages acointing is described in details in
http://wiki.openvz.org/User_pages_accounting

diff-bc-vmrss-prep.patch:
This patch intruduces small preparations for vmrss accounting
to make reviewing simpler.

diff-bc-vmrss-core.patch:
This is the core of vmrss accounting.
Pages are accounted in fractions and it is described in details in
http://wiki.openvz.org/RSS_fractions_accounting

diff-bc-vmrss-charge.patch:
Calls to vmrss core code over the kernel to do accounting.


Summary of changes from v3 patch set:

* Added basic user pages accounting (lockedpages/privvmpages)
* spell in Kconfig
* Makefile reworked
* EXPORT_SYMBOL_GPL
* union w/o name in struct page
* bc_task_charge is void now
* adjust minheld/maxheld splitted

Summary of changes from v2 patch set:

* introduced atomic_dec_and_lock_irqsave()
* bc_adjust_held_minmax comment
* added __must_check for bc_*charge* funcs
* use hash_long() instead of own one
* bc/Kconfig is sourced from init/Kconfig now
* introduced bcid_t type with comment from Alan Cox
* check for barrier <= limit in sys_set_bclimit()
* removed (bc == NULL) checks
* replaced memcpy in beancounter_findcrate with assignment
* moved check 'if (mask & BC_ALLOC)' out of the lock
* removed unnecessary memset()

Summary of changes from v1 patch set:

* CONFIG_BEANCOUNTERS is 'n' by default
* fixed Kconfig includes in arches
* removed hierarchical beancounters to simplify first patchset
* removed unused 'private' pointer
* removed unused EXPORTS
* MAXVALUE redeclared as LONG_MAX
* beancounter_findcreate clarification
* renamed UBC -> BC, ub -> bc etc.
* moved BC inheritance into copy_process
* introduced reset_exec_bc() with proposed BUG_ON
* removed task_bc beancounter (not used yet, for numproc)
* fixed syscalls for sparc
* added sys_get_bcstat(): return info that was in /proc
* cond_syscall instead of #ifdefs

Many thanks to Oleg Nesterov, Alan Cox, Matt Helsley and others
for patch review and comments.

Patch set is applicable to 2.6.18-rc5-mm1

Thanks,
Kirill 

             reply	other threads:[~2006-09-05 14:59 UTC|newest]

Thread overview: 144+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-09-05 15:02 Kirill Korotaev [this message]
2006-09-05 15:19 ` [PATCH 1/13] BC: introduce atomic_dec_and_lock_irqsave() Kirill Korotaev
2006-09-05 15:20 ` [PATCH 2/13] BC: kconfig Kirill Korotaev
2006-09-05 15:21 ` [PATCH 3/17] BC: beancounters core (API) Kirill Korotaev
2006-09-05 15:23 ` [PATCH 4/13] BC: context inheriting and changing Kirill Korotaev
2006-09-05 15:24 ` [PATCH 5/13] BC: user interface (syscalls) Kirill Korotaev
2006-09-05 16:04   ` [ckrm-tech] " Balbir Singh
2006-09-06  8:29     ` Pavel Emelianov
2006-09-06  8:57       ` Balbir Singh
2006-09-06 10:42         ` Pavel Emelianov
2006-09-06 13:23           ` Balbir Singh
2006-09-06 13:45   ` Balbir Singh
2006-09-06 14:23     ` Kirill Korotaev
2006-09-05 15:25 ` [PATCH 6/13] BC: kernel memory (core) Kirill Korotaev
2006-09-05 15:26 ` [PATCH 7/13] BC: kernel memory (marks) Kirill Korotaev
2006-09-06 14:19   ` Cedric Le Goater
2006-09-05 15:27 ` [PATCH 8/13] BC: locked pages (core) Kirill Korotaev
2006-09-05 15:29 ` [PATCH 9/13] BC: locked pages (charge hooks) Kirill Korotaev
2006-09-06  3:43   ` Nick Piggin
2006-09-06  8:45     ` Pavel Emelianov
2006-09-06  9:41       ` Nick Piggin
2006-09-06 14:16     ` [ckrm-tech] " Kirill Korotaev
2006-09-05 15:30 ` [PATCH 10/13] BC: privvm pages Kirill Korotaev
2006-09-05 15:31 ` [PATCH 11/13] BC: vmrss (preparations) Kirill Korotaev
2006-09-05 22:09   ` Cedric Le Goater
2006-09-06 13:59     ` Kirill Korotaev
2006-09-07 16:28   ` [ckrm-tech] " Balbir Singh
2006-09-05 15:32 ` [PATCH 12/13] BC: vmrss (core) Kirill Korotaev
2006-09-05 15:33 ` [PATCH 13/13] BC: vmrss (charges) Kirill Korotaev
2006-09-05 16:53 ` [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) Balbir Singh
2006-09-06  8:34   ` Pavel Emelianov
2006-09-06 13:06   ` Kirill Korotaev
2006-09-06 19:17     ` Balbir Singh
2006-09-06 22:06       ` Chandra Seetharaman
2006-09-07  3:08         ` Balbir Singh
2006-09-08  7:33         ` Pavel Emelianov
2006-09-08 15:43           ` Dave Hansen
2006-09-08 18:26             ` Balbir Singh
2006-09-11  6:56               ` Pavel Emelianov
2006-09-11  7:54                 ` Balbir Singh
2006-09-11  8:13                   ` Pavel Emelianov
2006-09-11  8:19                     ` Balbir Singh
2006-09-12 10:39                       ` Pavel Emelianov
2006-09-12 10:40                       ` Pavel Emelianov
2006-09-12 12:01                         ` Balbir Singh
2006-09-13 13:39                           ` Pavel Emelianov
2006-09-11 10:21                     ` Srivatsa Vaddagiri
2006-09-12 10:35                       ` Pavel Emelianov
2006-09-11 18:49                     ` Chandra Seetharaman
2006-09-12 10:48                       ` Pavel Emelianov
2006-09-12 23:58                         ` Chandra Seetharaman
2006-09-13  8:06                           ` Pavel Emelianov
2006-09-13 12:15                             ` Srivatsa Vaddagiri
2006-09-13 13:35                               ` Pavel Emelianov
2006-09-13 22:31                             ` Chandra Seetharaman
2006-09-14  7:53                               ` Pavel Emelianov
2006-09-14  8:06                                 ` Balbir Singh
2006-09-14 13:02                                   ` Pavel Emelianov
2006-09-15  0:02                                     ` Chandra Seetharaman
2006-09-15  7:21                                       ` Pavel Emelianov
2006-09-15  8:49                                       ` Kirill Korotaev
2006-09-18 23:51                                         ` Chandra Seetharaman
2006-09-14 23:42                                 ` Chandra Seetharaman
2006-09-15  7:15                                   ` Pavel Emelianov
2006-09-15  8:55                                     ` Kirill Korotaev
2006-09-15 11:15                                       ` Pavel Emelianov
2006-09-18  8:25                                         ` Balbir Singh
2006-09-18  8:56                                           ` Pavel Emelianov
2006-09-18 11:20                                             ` Balbir Singh
2006-09-18 11:32                                               ` Pavel Emelianov
2006-09-19  0:05                                             ` Chandra Seetharaman
2006-09-19  8:04                                               ` Pavel Emelianov
2006-09-18 11:27                                         ` Balbir Singh
2006-09-18 12:37                                           ` Pavel Emelianov
2006-09-19  0:08                                             ` Chandra Seetharaman
2006-09-19  8:06                                               ` Pavel Emelianov
2006-09-18 23:48                                     ` Chandra Seetharaman
2006-09-11 18:44                 ` Chandra Seetharaman
2006-09-15  8:57                   ` Kirill Korotaev
2006-09-18 23:57                     ` Chandra Seetharaman
2006-09-08 19:23           ` Chandra Seetharaman
2006-09-08 21:43             ` Rohit Seth
2006-09-11 18:25               ` Chandra Seetharaman
2006-09-11 19:10                 ` Rohit Seth
2006-09-11 19:42                   ` Chandra Seetharaman
2006-09-11 23:58                     ` Rohit Seth
2006-09-12  9:53                       ` Balbir Singh
2006-09-12 23:54                       ` Chandra Seetharaman
2006-09-13  0:39                         ` Rohit Seth
2006-09-13  1:10                           ` Chandra Seetharaman
2006-09-13  1:25                             ` Rohit Seth
2006-09-13  4:41                               ` Srivatsa Vaddagiri
2006-09-13 22:20                               ` Chandra Seetharaman
2006-09-14  1:22                                 ` Rohit Seth
2006-09-14 23:13                                   ` Chandra Seetharaman
2006-09-11 19:48                   ` [Devel] " Kir Kolyshkin
2006-09-12  0:28                     ` Rohit Seth
2006-09-12 10:44                   ` Srivatsa Vaddagiri
2006-09-12 17:22                     ` Rohit Seth
2006-09-12 17:40                       ` Srivatsa Vaddagiri
2006-09-12 18:02                         ` Rohit Seth
2006-09-13  0:02                       ` Chandra Seetharaman
2006-09-13  0:43                         ` Rohit Seth
2006-09-13  1:13                           ` Chandra Seetharaman
2006-09-13  1:33                             ` Rohit Seth
2006-09-13 22:24                               ` Chandra Seetharaman
2006-09-14  1:27                                 ` Rohit Seth
2006-09-14 23:28                                   ` Chandra Seetharaman
2006-09-15  9:26                                 ` Kirill Korotaev
2006-09-15 16:52                                   ` Rohit Seth
2006-09-15 21:21                                     ` [Devel] " Kir Kolyshkin
2006-09-15 21:58                                       ` Rohit Seth
2006-09-19  0:02                                       ` [ckrm-tech] [Devel] " Chandra Seetharaman
2006-09-18 23:59                                   ` [ckrm-tech] " Chandra Seetharaman
2006-09-06 21:47     ` Chandra Seetharaman
2006-09-08 15:33     ` Dave Hansen
2006-09-08 15:57     ` [RFC] Add tgid aggregation to beancounters (was Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory)) Balbir Singh
2006-09-11 21:24       ` V2: " Balbir Singh
2006-09-15 16:40         ` [ckrm-tech] V2: Add tgid aggregation to beancounters (was " Kirill Korotaev
2006-10-09  8:23           ` Balbir Singh
2006-09-05 17:46 ` [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) Dave Hansen
2006-09-05 18:28   ` Balbir Singh
2006-09-06  0:17   ` Rohit Seth
2006-09-08 15:30     ` Dave Hansen
2006-09-08 17:10       ` Rohit Seth
2006-09-08 17:26         ` Shailabh Nagar
2006-09-08 21:15           ` Rohit Seth
2006-09-08 21:28             ` Shailabh Nagar
2006-09-06 13:57   ` Kirill Korotaev
2006-09-06 21:54     ` Chandra Seetharaman
2006-09-07  7:29       ` Pavel Emelianov
2006-09-07 19:16         ` Chandra Seetharaman
2006-09-08  7:22           ` Pavel Emelianov
2006-09-08 19:07             ` Chandra Seetharaman
2006-09-11  7:02               ` Pavel Emelianov
2006-09-11 13:04                 ` Srivatsa Vaddagiri
2006-09-12 10:24                   ` Pavel Emelianov
2006-09-12 10:29                     ` Srivatsa Vaddagiri
2006-09-12 11:06                       ` Pavel Emelianov
2006-09-12 12:04                         ` Srivatsa Vaddagiri
2006-09-11 18:47                 ` Chandra Seetharaman
2006-09-07 19:29         ` Chandra Seetharaman
2006-09-08  7:26           ` Pavel Emelianov
2006-09-08 19:10             ` Chandra Seetharaman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=44FD918A.7050501@sw.ru \
    --to=dev@sw.ru \
    --cc=adobriyan@mail.ru \
    --cc=ak@suse.de \
    --cc=akpm@osdl.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=ckrm-tech@lists.sourceforge.net \
    --cc=devel@openvz.org \
    --cc=hch@infradead.org \
    --cc=hugh@veritas.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matthltc@us.ibm.com \
    --cc=oleg@tv-sign.ru \
    --cc=riel@redhat.com \
    --cc=saw@sw.ru \
    --cc=xemul@openvz.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.