From: Dave Hansen <dave@linux.vnet.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: mpm@selenic.com, containers@lists.linux-foundation.org,
hpa@zytor.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
viro@zeniv.linux.org.uk, linux-api@vger.kernel.org,
mingo@elte.hu, torvalds@linux-foundation.org, tglx@linutronix.de,
xemul@openvz.org, Alexey Dobriyan <adobriyan@gmail.com>
Subject: How much of a mess does OpenVZ make? ;) Was: What can OpenVZ do?
Date: Thu, 12 Feb 2009 15:04:05 -0800
Message-ID: <1234479845.30155.220.camel@nimitz>
In-Reply-To: <20090212141014.2cd3d54d.akpm@linux-foundation.org>
On Thu, 2009-02-12 at 14:10 -0800, Andrew Morton wrote:
> On Thu, 12 Feb 2009 13:51:23 -0800
> Dave Hansen <dave@linux.vnet.ibm.com> wrote:
>
> > On Thu, 2009-02-12 at 11:42 -0800, Andrew Morton wrote:
> > > On Thu, 12 Feb 2009 13:30:35 -0600
> > > Matt Mackall <mpm@selenic.com> wrote:
> > >
> > > > On Thu, 2009-02-12 at 10:11 -0800, Dave Hansen wrote:
> > > >
> > > > > > - In bullet-point form, what features are missing, and should be added?
> > > > >
> > > > > > * support for more architectures than i386
> > > > > > * file descriptors:
> > > > > >   * sockets (network, AF_UNIX, etc...)
> > > > > >   * device files
> > > > > >   * shmfs, hugetlbfs
> > > > > >   * epoll
> > > > > >   * unlinked files
> > > >
> > > > > * Filesystem state
> > > > > >   * contents of files
> > > > > >   * mount tree for individual processes
> > > > > * flock
> > > > > * threads and sessions
> > > > > * CPU and NUMA affinity
> > > > > * sys_remap_file_pages()
> > > >
> > > > > I think the real question is: where are the dragons hiding? Some of
> > > > > these are known to be hard. And some of them are critical to checkpointing
> > > > > typical applications. If you have plans or theories for implementing all
> > > > of the above, then great. But this list doesn't really give any sense of
> > > > whether we should be scared of what lurks behind those doors.
> > >
> > > How close has OpenVZ come to implementing all of this? I think the
> > > > implementation is fairly complete?
> >
> > I also believe it is "fairly complete". At least able to be used
> > practically.
> >
> > > If so, perhaps that can be used as a guide. Will the planned feature
> > > have a similar design? If not, how will it differ? To what extent can
> > > we use that implementation as a tool for understanding what this new
> > > implementation will look like?
> >
> > Yes, we can certainly use it as a guide. However, there are some
> > barriers to being able to do that:
> >
> > dave@nimitz:~/kernels/linux-2.6-openvz$ git diff v2.6.27.10... | diffstat | tail -1
> > 628 files changed, 59597 insertions(+), 2927 deletions(-)
> > dave@nimitz:~/kernels/linux-2.6-openvz$ git diff v2.6.27.10... | wc
> > 84887 290855 2308745
> >
> > Unfortunately, the git tree doesn't have that great of a history. It
> > appears that the forward-ports are just applications of huge single
> > patches which then get committed into git. This tree has also
> > historically contained a bunch of stuff not directly related to
> > checkpoint/restart like resource management.
> >
> > We'd be idiots not to take a hard look at what has been done in OpenVZ.
> > But, for the time being, we have absolutely no shortage of things that
> > we know are important and know have to be done. Our largest problem is
> > not finding things to do, but is our large out-of-tree patch that is
> > growing by the day. :(
> >
>
> Well we have a chicken-and-eggish thing. The patchset will keep
> growing until we understand how much of this:
>
> > dave@nimitz:~/kernels/linux-2.6-openvz$ git diff v2.6.27.10... | diffstat | tail -1
> > 628 files changed, 59597 insertions(+), 2927 deletions(-)
>
> we will be committed to if we were to merge the current patchset.
Here's the measurement that Alexey suggested:
dave@nimitz:~/kernels/linux-2.6-openvz$ git diff v2.6.27.10... kernel/cpt/ | diffstat
Makefile | 53 +
cpt_conntrack.c | 365 ++++++++++++
cpt_context.c | 257 ++++++++
cpt_context.h | 215 +++++++
cpt_dump.c | 1250 ++++++++++++++++++++++++++++++++++++++++++
cpt_dump.h | 16
cpt_epoll.c | 113 +++
cpt_exports.c | 13
cpt_files.c | 1626 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
cpt_files.h | 71 ++
cpt_fsmagic.h | 16
cpt_inotify.c | 144 ++++
cpt_kernel.c | 177 ++++++
cpt_kernel.h | 99 +++
cpt_mm.c | 923 +++++++++++++++++++++++++++++++
cpt_mm.h | 35 +
cpt_net.c | 614 ++++++++++++++++++++
cpt_net.h | 7
cpt_obj.c | 162 +++++
cpt_obj.h | 62 ++
cpt_proc.c | 595 ++++++++++++++++++++
cpt_process.c | 1369 ++++++++++++++++++++++++++++++++++++++++++++++
cpt_process.h | 13
cpt_socket.c | 790 ++++++++++++++++++++++++++
cpt_socket.h | 33 +
cpt_socket_in.c | 450 +++++++++++++++
cpt_syscalls.h | 101 +++
cpt_sysvipc.c | 403 +++++++++++++
cpt_tty.c | 215 +++++++
cpt_ubc.c | 132 ++++
cpt_ubc.h | 23
cpt_x8664.S | 67 ++
rst_conntrack.c | 283 +++++++++
rst_context.c | 323 ++++++++++
rst_epoll.c | 169 +++++
rst_files.c | 1648 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
rst_inotify.c | 196 ++++++
rst_mm.c | 1151 +++++++++++++++++++++++++++++++++++++++
rst_net.c | 741 +++++++++++++++++++++++++
rst_proc.c | 580 +++++++++++++++++++
rst_process.c | 1640 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
rst_socket.c | 918 +++++++++++++++++++++++++++++++
rst_socket_in.c | 489 ++++++++++++++++
rst_sysvipc.c | 633 +++++++++++++++++++++
rst_tty.c | 384 +++++++++++++
rst_ubc.c | 131 ++++
rst_undump.c | 1007 ++++++++++++++++++++++++++++++++++
47 files changed, 20702 insertions(+)
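One rough way to read the diffstat above is to split the per-file counts into checkpoint-side (cpt_*) and restore-side (rst_*) totals. This is only a sketch: it assumes the diffstat text arrives on stdin in the "name | count +++" layout shown above, and the function name summarize_diffstat is made up for illustration. It is not part of git or diffstat.

```shell
# Hypothetical helper: sum diffstat per-file change counts by filename prefix.
# Reads diffstat output on stdin; the trailing "N files changed" summary line
# has no '|' separator and is skipped.
summarize_diffstat() {
    awk -F'|' 'NF == 2 {
        name = $1
        gsub(/[ \t]/, "", name)      # strip padding around the file name
        split($2, f, " ")            # f[1] is the changed-line count
        n = f[1] + 0
        if (name ~ /^cpt_/)      cpt += n
        else if (name ~ /^rst_/) rst += n
        else                     other += n
    }
    END { printf "checkpoint=%d restore=%d other=%d\n", cpt, rst, other }'
}
```

It could be fed from the same pipeline as before, for example: git diff v2.6.27.10... kernel/cpt/ | diffstat | summarize_diffstat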
One important thing that leaves out is the interaction that this code
has with the rest of the kernel. That's critically important when
considering long-term maintenance, and I'd be curious how the OpenVZ
folks view it.
> Now, we've gone in blind before - most notably on the
> containers/cgroups/namespaces stuff. That hail mary pass worked out
> acceptably, I think. Maybe we got lucky. I thought that
> net-namespaces in particular would never get there, but it did.
>
> That was a very large and quite long-term-important user-visible
> feature.
>
> checkpoint/restart/migration is also a long-term-...-feature. But if
> at all possible I do think that we should go into it with our eyes a
> little less shut.
One thing Ingo has asked for that I understand a bit more clearly is a
programmatic statement of what is and is not covered by this current
code. That's certainly one eye-opening activity which I'll get to
immediately.
-- Dave