From: Rohit Seth <rohitseth-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
To: "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
Cc: ckrm-tech-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org,
	xemul-3ImXcnM4P+0@public.gmane.org,
	pj-sJ/iWh9BUns@public.gmane.org,
	cpw-sJ/iWh9BUns@public.gmane.org,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org,
	containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org,
	menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org
Subject: Re: containers development plans (July 20 version)
Date: Fri, 20 Jul 2007 14:29:38 -0700
Message-ID: <1184966978.10091.448.camel@galaxy.corp.google.com>
In-Reply-To: <20070720173615.GA25167-4+H9nbzk0uAIagZqoN9o3w@public.gmane.org>

Thanks, Serge, for collecting these requirements.  Have we decided on a
container mini-summit?  A couple of points that I want to add to the task
container functionality section (not sure if these are already covered
by the items below):

1- Per-container dirty page (write throttling) limit.
2- Per-container memory reclaim.
3- Network rate limiting (outbound) based on container.
4- User-level APIs to identify the resource limits that are allowed to a
job, for example, how much physical memory a process can use.  This
should be seamlessly integrated with non-container environments as well
(maybe with ulimit).
5- Similarly, per-container stats, like pages on the active list, CPU usage,
etc., could also be very helpful.
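
For point 4, the closest existing per-process interface is the rlimit one
behind ulimit.  A minimal sketch of querying it from user level, assuming
Python's standard `resource` module on Linux; the helper names
`describe_limit` and `current_limits` are illustrative only and are not part
of any proposed container API:

```python
import resource


def describe_limit(name, which):
    """Return a one-line description of a single rlimit (soft and hard)."""
    soft, hard = resource.getrlimit(which)

    def fmt(value):
        # RLIM_INFINITY is how the kernel reports "no limit set".
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"


def current_limits():
    """Collect the memory-related rlimits of the calling process."""
    return [
        describe_limit("address space (bytes)", resource.RLIMIT_AS),
        describe_limit("resident set (bytes)", resource.RLIMIT_RSS),
    ]


if __name__ == "__main__":
    for line in current_limits():
        print(line)
```

A container-aware version would presumably report the limit of the enclosing
container through the same kind of call, so that a job does not need to know
whether it is running inside a container.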


Thanks,
-rohit

On Fri, 2007-07-20 at 12:36 -0500, Serge E. Hallyn wrote:
> (If you missed earlier parts of this thread, you can catch up
> starting at
> https://lists.linux-foundation.org/pipermail/containers/2007-July/005860.html)
> 
> ======================  Section 0  ======================
> =Status of this document
> ======================  Section 0  ======================
> 
> I've added a 'use cases' section.  That is where we attempt to
> explain to people not familiar with containers work why it is
> worth integrating upstream.
> 
> Srivatsa Vaddagiri is independently gathering additional information
> on specific task container subsystems.  That will eventually be
> incorporated into the final version of this roadmap.
> 
> ======================  Section 1  ======================
> =Introduction
> ======================  Section 1  ======================
> 
> We are trying to create a roadmap for the next year of
> 'container' development, to be reported to the upcoming kernel
> summit.  Containers here is a bit of an ambiguous term, so we are
> taking it to mean all of:
> 
> 	1. namespaces
>                 kernel resource namespaces to support resource isolation
>                 and virtualization for virtual servers and application
>                 checkpoint/restart.
> 	2. task containers framework
>                 the task containers (or, as Paul Jackson suggests, resource
>                 containers) framework by Paul Menage which especially
>                 provides a framework for subsystems which perform resource
>                 accounting and limits.
> 	3. checkpoint/restart
> 
> ======================  Section 2  ======================
> =Detailed development plans
> ======================  Section 2  ======================
> 
> A (still under construction) list of features we expect to be worked on
> next year looks like this:
> 
>         1. completion of ongoing namespaces
>                 pid namespace
>                         push merged patchset upstream
>                         kthread cleanup
>                                 especially nfs
>                                 autofs
>                         af_unix credentials (stores pid_t?)
>                 net namespace
>                 ro bind mounts
>         2. continuation with new namespaces
>                 devpts, console, and ttydrivers
>                 user
>                 time
>                 namespace management tools
>                 namespace entering  (using one of:)
>                         bind_ns()
>                         ns container subsystem
>                         (vs refuse this functionality)
>                 multiple /sys mounts
>                         break /sys into smaller chunks?
>                         shadow dirs vs namespaces
>                 multiple proc mounts
>                         likely need to extend on the work done for pid namespaces
>                         i.e. other /proc files will need some care
>                                 virtualization of statistics for 'top', etc
>         3. any additional work needed for virtual servers?
>                 i.e. in-kernel keyring usage for cross-usernamespace permissions, etc
>                         nfs and rpc updates needed?
>                         general security fixes
>                                 per-container capabilities?
>                         device access controls
>                                 (e.g. root in container should not have access to /dev/sda by default)
>                         filesystems access controls
> 
>         4. task containers functionality
>                 base features
>                         virtualized containerfs mounts
>                                 to support vserver management of sub-containers
>                         locking cleanup
>                         control file API simplification
>                         control file prefixing with subsystem name
>                 userspace RBCE to provide controls for
>                         users
>                         groups
>                         pgrp
>                         executable
>                 specific containers
>                         split cpusets into
>                                 cpuset
>                                 memset
>                         network
>                                 connect/bind/accept controller using iptables
>                         network flow id control
>                         userspace per-container OOM handler
>                         per-container swap
>                         per-container disk I/O scheduling
> 
>         5. checkpoint/restart
>                 memory c/r
>                         (there are a few designs and prototypes)
>                         (though this may be ironed out by then)
>                         per-container swapfile?
>                 overall checkpoint strategy  (one of:)
>                         in-kernel
>                         userspace-driven
>                         hybrid
>                 overall restart strategy
>                 use freezer API
>                 use suspend-to-disk?
>                 sysvipc
>                         "set identifier" syscall
>                 pid namespace
>                         clone_with_pid()
> 
> 
> ======================  Section 3  ======================
> =Use cases
> ======================  Section 3  ======================
> 
> 	1. Namespaces:
> 
> 	The most commonly listed uses for namespaces are virtual
> 	servers and checkpoint restart.  Other uses are debugging
> 	(running tests in not-quite-virtual-servers) and resource
> 	isolation, such as the use of mounts namespaces to simulate
> 	multi-level directories for LSPP.
> 
> 	2. Task Containers:
> 
> 	(Vatsa to fill in)
> 
> 	3. Checkpoint/restart
> 
> 	load balancing:
> 	applications can be migrated from high-load systems to ones
> 	with a lower load.  Long-running applications can be checkpointed
> 	(or migrated) to start a short-running high-load job, then
> 	restarted.
> 
> 	kernel upgrades:
> 	A long-running application - or whole virtual server - can
> 	be migrated or checkpointed so that the system can be
> 	rebooted, and the application can continue to run.
> 
> 
> ======================  Section 4  ======================
> =Involved parties
> ======================  Section 4  ======================
> 
> In the list of stakeholders, I try to guess based on past comments and
> contributions what *general* area they are most likely to contribute in.
> I may try to narrow those down later, but am just trying to get something
> out the door right now before my next computer breaks.
> 
> Stakeholders:
>         Eric Biederman
>                 everything
>         google
>                 task containers
>         ibm (serge, dave, cedric, daniel)
>                 namespaces
>                 checkpoint/restart
>         bull (benjamin, pierre)
>                 namespaces
>                 checkpoint/restart
>         ibm (balbir, vatsa)
>                 task containers
>         kerlabs
>                 checkpoint/restart
>         openvz
>                 everything
>         NEC Japan (Masahiko Takahashi)
>                 checkpoint/restart
>         Linux-VServer
>                 namespaces+containers
>         zap project
>                 checkpoint/restart
>         planetlab
>                 everything
>         hp
>                 (i must have lost an email - what are they
>                 interested in working on?)
>         XtreemOS
>                 checkpoint/restart
>         Fujitsu/VA Linux Japan
>                 resource control
> 
> Is anyone else still missing from the list?
> 
> thanks,
> -serge
