* trying to build simple checkpoint/restart recipes
@ 2010-12-08 4:53 Serge E. Hallyn
[not found] ` <20101208045322.GA17602-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: Serge E. Hallyn @ 2010-12-08 4:53 UTC (permalink / raw)
To: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
What I've done so far:
created a KVM vm and installed up-to-date maverick
add-apt-repository ppa:appcr/ppa
apt-get update && apt-get dist-upgrade
apt-get install libvirt-bin lxc linux-image-2.6.34-1cr4
sed -i 's/GRUB_DEFAULT=0/GRUB_DEFAULT="Ubuntu, with Linux 2.6.34-1cr4-generic"/' /etc/default/grub
update-grub
replaced 122 with 123 in /etc/libvirt/qemu/networks/default.xml and /var/lib/libvirt/network/default.xml
reboot
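The subnet edit above could be scripted roughly as follows. This is only a sketch: it assumes the "122" being replaced is the third octet of libvirt's default 192.168.122.x network (the message does not spell that out), and `renumber_subnet` is a made-up helper name.

```shell
#!/bin/sh
# Sketch: move a libvirt network definition from 192.168.122.x to 192.168.123.x.
# Assumption: "122" refers to the third octet of the default libvirt subnet.
renumber_subnet() (
    file=$1
    # Keep a copy of the original next to the file, then rewrite it.
    cp -p "$file" "$file.orig" || exit 1
    sed 's/192\.168\.122\./192.168.123./g' "$file.orig" > "$file"
)

# Hypothetical usage with the two paths named in the message:
# renumber_subnet /etc/libvirt/qemu/networks/default.xml
# renumber_subnet /var/lib/libvirt/network/default.xml
```

Matching the full `192.168.122.` prefix rather than a bare `122` avoids touching unrelated numbers (MAC addresses, UUIDs) in the XML.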
# The following should go into an upstart script shipped with the appcr packages
# as they must be done on each boot
chmod 666 /dev/pts/ptmx
rm /dev/ptmx
ln -s /dev/pts/ptmx /dev/ptmx
mkdir -p /cgroup
mount -t cgroup cgroup /cgroup/
echo /bin/remove_dead_cgroup.sh > /cgroup/release_agent
echo 1 > /cgroup/notify_on_release
#
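Until such a boot script ships, the per-boot steps above might be wrapped in a small helper like the one below. It is a sketch, not the appcr packaging: `run`, `appcr_boot_setup` and the `DRY_RUN` switch are invented for illustration, and only the paths and commands come from the list above.

```shell
#!/bin/sh
# Sketch of a per-boot setup helper.
# DRY_RUN=1 prints each command instead of executing it (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

appcr_boot_setup() {
    # Replace /dev/ptmx with a symlink to the devpts ptmx node.
    run chmod 666 /dev/pts/ptmx
    run rm -f /dev/ptmx
    run ln -s /dev/pts/ptmx /dev/ptmx

    # Mount the cgroup hierarchy unless it is already mounted on /cgroup.
    run mkdir -p /cgroup
    if ! grep -q ' /cgroup cgroup ' /proc/mounts 2>/dev/null; then
        run mount -t cgroup cgroup /cgroup
    fi

    # Register the release agent and enable release notification.
    run sh -c 'echo /bin/remove_dead_cgroup.sh > /cgroup/release_agent'
    run sh -c 'echo 1 > /cgroup/notify_on_release'
}

# Hypothetical usage (as root): appcr_boot_setup
# Preview only:                 DRY_RUN=1 appcr_boot_setup
```

The mount check makes the helper safe to call more than once per boot, which the bare command list is not.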
cat > /etc/lxc-basic.conf << EOF
lxc.network.type=veth
lxc.network.link=virbr0
lxc.network.flags=up
EOF
lxc-create -f /etc/lxc-basic.conf -n cr1 -t ubuntu
cd /var/lib/lxc/cr1/rootfs/sbin
mv init upstart
cat > init << EOF
#!/bin/sh
rm -f /shutdown
hostname cr1
exec 0<&-
exec 0</dev/null
exec 1>&-
exec 1>nohup.out
exec 2>&-
exec 2>&1
mkdir -p /tmp2
mount --bind /tmp2 /tmp
mount -a
mount -t proc proc /proc
mount -t tmpfs varrun /var/run
mkdir /var/run/network
mkdir /var/run/sshd
ifconfig eth0 192.168.123.21 up
screen -A -d -m -S console
/usr/sbin/sshd
while [ ! -f /shutdown ]; do
sleep 4s
done
EOF
lxc-start -n cr1
(in another console)
ssh 192.168.123.21
screen -r
ps
ctrl-a d
exit
lxc-freeze -n cr1
lxc-checkpoint -n cr1 -S /root/cr1.s1
So far, so good. Note that I couldn't use upstart for my init because upstart
uses inotify, which we don't yet checkpoint. The kernel is compiled without
ipv6 because that was also causing a problem (though I thought ipv6 was supported
for checkpoint?), and therefore I needed a custom libvirt package which doesn't
break when ipv6 is not there.
The problem now is when attempting to restart:
lxc-stop -n cr1
lxc-restart -n cr1 -S /root/cr1.s1
There are two issues:
1. how to re-create the mounts. Kernel doesn't do it yet. There
   isn't (that I know of) a clean way to hook lxc-restart to do it.
   Comments?
2. likewise there *may* end up being a question of where to best hook
   the backup/snapshot and restore of filesystems. Though for these
   examples (screen and next vncserver) it shouldn't be necessary.
I'm trying to do this using lxc-checkpoint and lxc-restart so as to
keep the instructions as simple as possible. The hope is in the next
few weeks to have a few recipes that people can try out. But if it's
not (cleanly) possible using lxc-restart, then I guess I can switch to
using user-cr and some container tarballs, which seems a lot easier in
the short term but less useful in the long term.
thanks,
-serge
^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <20101208045322.GA17602-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>]
* Re: trying to build simple checkpoint/restart recipes
[not found] ` <20101208045322.GA17602-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2010-12-08 5:53 ` Matt Helsley
[not found] ` <20101208055320.GH10470-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: Matt Helsley @ 2010-12-08 5:53 UTC (permalink / raw)
To: Serge E. Hallyn; +Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Wed, Dec 08, 2010 at 04:53:22AM +0000, Serge E. Hallyn wrote:
> What I've done so far:
>
> created a KVM vm and installed up-to-date maverick
> add-apt-repository ppa:appcr/ppa
> apt-get update && apt-get dist-upgrade
> apt-get install libvirt-bin lxc linux-image-2.6.34-1cr4
> sed -i 's/GRUB_DEFAULT=0/GRUB_DEFAULT="Ubuntu, with Linux 2.6.34-1cr4-generic"/' /etc/default/grub
> update-grub
>
> replaced 122 with 123 in /etc/libvirt/qemu/networks/default.xml and /var/lib/libvirt/network/default.xml
> reboot
>
> # The following should go into an upstart script shipped with the appcr packages
> # as they must be done on each boot
> chmod 666 /dev/pts/ptmx
> rm /dev/ptmx
> ln -s /dev/pts/ptmx /dev/ptmx
> mkdir -p /cgroup
> mount -t cgroup cggroup /cgroup/
> echo /bin/remove_dead_cgroup.sh > /cgroup/release_agent
> echo 1 > /cgroup/notify_on_release
> #
>
> cat > /etc/lxc-basic.conf << EOF
> lxc.network.type=veth
> lxc.network.link=virbr0
> lxc.network.flags=up
> EOF
>
> lxc-create -f /etc/lxc-basic.conf -n cr1 -t ubuntu
> cd /var/lib/lxc/cr1/rootfs/sbin
> mv init upstart
>
> cat > init << EOF
> #!/bin/sh
> rm -f /shutdown
> hostname cr1
>
> exec 0<&-
> exec 0</dev/null
> exec 1>&-
> exec 1>nohup.out
> exec 2>&-
> exec 2>nohup.out
>
> mkdir -p /tmp2
> mount --bind /tmp2 /tmp
>
> mount -a
> mount -t proc proc /proc
> mount -t tmpfs varrun /var/run
> mkdir /var/run/network
> mkdir /var/run/sshd
> ifconfig eth0 192.168.123.21 up
> screen -A -d -m -S console
>
> /usr/sbin/sshd
> while [ ! -f /shutdown ]; do
> sleep 4s
> done
> EOF
>
> lxc-start -n cr1
>
> (in another console)
> ssh 192.168.123.21
> screen -r
> ps
> ctrl-a d
> exit
>
> lxc-freeze -n cr1
> lxc-checkout -n cr1 -S /root/cr1.s1
>
> So far, so good. Note that I couldn't use upstart for my init bc upstart
> uses inotify, which we don't yet checkpoint. The kernel is compiled without

Interesting, I didn't know that. What does upstart use inotify for?

> ipv6 bc that was also causing a problem (though I thought ipv6 was supported
> for checkpoint?) and therefore I needed a custom libvirt package which didn't
> break when ipv6 is not there.
>
> The problem now is when attempting to restart:
>
> lxc-stop -n cr1
> lxc-restart -n cr1 -S /root/cr1.s1
>
> There are two issues:
>
> 1. how to re-create the mounts. Kernel doesn't do it yet. There
>    isn't (that I know of) a clean way to hook lxc-restart to do it.
>    Comments?

It's incomplete but I think you can save the most important portions of
a mount namespace with a simple 1-line command:

    lxc-attach -n cr1 cat /proc/self/mountinfo > cr1.mountinfo

It's incomplete because:

1. It does not adequately address cross-mount-ns bind mounts (IIRC).

2. It won't work for nested containers (though I don't know if
   lxc supports this already it's not *too* far fetched
   to expect folks will ask for it in the future). We can
   extend the hack to deal with this by making a small
   change in sys_checkpoint but I can't see how to fix #1
   without doing it all in-kernel anyway.

The restoration of the mounts is not scriptable however. It involves
parsing the mountinfo file and coordinating the mounts with those done by
lxc itself during lxc-restart. I honestly haven't looked at that closely
enough yet to say how pretty/ugly that'd be but it entails
modifications to lxc-restart itself. And since #1 above would still
be an issue I'm not sure it's worth doing it that way.

Cheers,
-Matt Helsley
^ permalink raw reply [flat|nested] 5+ messages in thread
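To give a feel for the parsing step mentioned above, here is a rough sketch that reduces a saved mountinfo file to the fields a restore step would most likely need. It assumes the field layout documented in proc(5); `parse_mountinfo` is a made-up name, and nothing here coordinates with lxc-restart, which is the hard part.

```shell
#!/bin/sh
# Sketch: print "mountpoint root fstype source" for each line of a saved
# mountinfo file (for example the cr1.mountinfo written by the one-liner above).
# Layout per proc(5): ID, parent ID, major:minor, root, mount point, options,
# zero or more optional fields, a "-" separator, fstype, source, super options.
parse_mountinfo() {
    awk '{
        sep = 0
        # The optional fields vary in number, so locate the "-" separator.
        for (i = 7; i <= NF; i++) if ($i == "-") { sep = i; break }
        if (sep) print $5, $4, $(sep + 1), $(sep + 2)
    }' "$1"
}

# Hypothetical usage: parse_mountinfo cr1.mountinfo
```

A root field other than `/` means only a subtree of the filesystem is mounted at that point, which is one hint that the entry came from a bind mount.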
[parent not found: <20101208055320.GH10470-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>]
* Re: trying to build simple checkpoint/restart recipes
[not found] ` <20101208055320.GH10470-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
@ 2010-12-08 14:52 ` Serge E. Hallyn
[not found] ` <20101208145245.GB8316-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: Serge E. Hallyn @ 2010-12-08 14:52 UTC (permalink / raw)
To: Matt Helsley; +Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Quoting Matt Helsley (matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org):
> > So far, so good. Note that I couldn't use upstart for my init bc upstart
> > uses inotify, which we don't yet checkpoint. The kernel is compiled without
>
> Interesting, I didn't know that. What does upstart use inotify for?

Dunno :) I was quite put out though.

> > There are two issues:
> >
> > 1. how to re-create the mounts. Kernel doesn't do it yet. There
> >    isn't (that I know of) a clean way to hook lxc-restart to do it.
> >    Comments?
>
> It's incomplete but I think you can save the most important portions of
> a mount namespace with a simple 1-line command:
>
>     lxc-attach -n cr1 cat /proc/self/mountinfo > cr1.mountinfo
>
> It's incomplete because:
>
> 1. It does not adequately address cross-mount-ns bind mounts (IIRC).
>
> 2. It won't work for nested containers (though I don't know if
>    lxc supports this already it's not *too* far fetched
>    to expect folks will ask for it in the future). We can
>    extend the hack to deal with this by making a small
>    change in sys_checkpoint but I can't see how to fix #1
>    without doing it all in-kernel anyway.

Heck, for these examples I don't mind just having a sort of dummy
fstab file which both the dummy init and restart use.

> The restoration of the mounts is not scriptable however. It involves
> parsing the mountinfo file and coordinating the mounts with those done by
> lxc itself during lxc-restart. I honestly haven't looked at that closely

I'd be fine with requiring some bit of hand-parsing. But right, even
once we get a list of the mounts to be restored, I don't know of any
good way to get those mounts re-created at the right time.

I suppose I could hack lxc-restart to do it. But I'm sort of hoping we
can get something less hacked and more true to the 'real' upstream
code.

> enough yet to say how pretty/ugly that'd be but it entails
> modifications to lxc-restart itself. And since #1 above would still
> be an issue I'm not sure it's worth doing it that way.

So do you know of anyone who's been working on re-creation of mounts
in the kernel? If not, what have you been doing, hand-scripting
all container creation, checkpoint, and restart?

thanks,
-serge
^ permalink raw reply [flat|nested] 5+ messages in thread
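The "dummy fstab" idea could look something like the sketch below: one fstab-style file, read both by the container's stand-in init and by whatever runs at restart. The function name `fstab_to_mounts` and the file layout are assumptions for illustration, and the commands are only printed so the caller decides when to run them.

```shell
#!/bin/sh
# Sketch: turn an fstab-style file into mount commands, printed rather than run.
# Columns follow fstab(5): source, mount point, type, options (rest ignored).
fstab_to_mounts() {
    while read -r src dst fstype opts _rest; do
        # Skip blank lines and comments.
        case $src in ''|'#'*) continue ;; esac
        case ",$opts," in
            *,bind,*) echo "mount --bind $src $dst" ;;
            *)        echo "mount -t $fstype -o ${opts:-defaults} $src $dst" ;;
        esac
    done < "$1"
}

# Hypothetical file mirroring the mounts done by the stand-in init:
#   /tmp2   /tmp      none   bind      0 0
#   proc    /proc     proc   defaults  0 0
#   varrun  /var/run  tmpfs  defaults  0 0
#
# Hypothetical usage: fstab_to_mounts /etc/cr1.fstab | sh
```

This sidesteps mountinfo parsing entirely, at the cost of only covering mounts that were declared up front.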
[parent not found: <20101208145245.GB8316-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>]
* RE: trying to build simple checkpoint/restart recipes
[not found] ` <20101208145245.GB8316-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
@ 2010-12-08 21:10 ` Rob Landley
[not found] ` <7E28E74ACE78074AAD1BDD3E455CF8749422-w6YtkvcGFufufkSEj+1U85Z3qXmFLfmx@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: Rob Landley @ 2010-12-08 21:10 UTC (permalink / raw)
To: Serge E. Hallyn, Matt Helsley
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

> > The restoration of the mounts is not scriptable however. It involves
> > parsing the mountinfo file and coordinating the mounts with those done by
> > lxc itself during lxc-restart. I honestly haven't looked at that closely
>
> I'd be fine with requiring some bit of hand-parsing. But right, even
> once we get a list of the mounts to be restored, I don't know of any
> good way to get those mounts re-created at the right time.

Mount code is one of my old stomping grounds from back when I wrote
the busybox mount and switch_root commands and had to learn more
implementation details about it than I ever wanted to know. :)

I never could find a proper mount spec, and kept meaning to write one,
but I blathered about some of the less obvious details here:

http://www.mail-archive.com/busybox-9GAsQqxh4YTR7s880joybQ@public.gmane.org/msg07013.html

There are four top level categories of filesystem: Block backed, ram backed,
pipe backed (network and fuse and so on), and synthetic (sysfs, procfs,
devtmpfs...). And that's not counting bind mounts (which are internal
to the VFS and not really a filesystem), and loopback devices (which are
sort of the _opposite_ of a filesystem)...

> I suppose I could hack lxc-restart to do it. But I'm sort of hoping we
> can get something less hacked and more true to the 'real' upstream
> code.

Which upstream code?

> So do you know of anyone who's been working on re-creation of mounts
> in the kernel? If not, what have you been doing, hand-scripting
> all container creation, checkpoint, and restart?

I express interest in this topic.

Rob
^ permalink raw reply [flat|nested] 5+ messages in thread
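The four categories above can be turned into a small lookup, which a restore script might use to decide how each mount has to be handled. This is a sketch only: `fs_category` is a made-up name, and apart from sysfs, proc, devtmpfs and fuse, the filesystem types listed under each category are my own assumptions about where they belong.

```shell
#!/bin/sh
# Sketch: map a filesystem type (as shown in mountinfo) to one of the four
# categories from the message. Types not named in the message are assumptions.
fs_category() {
    case $1 in
        sysfs|proc|devtmpfs)          echo synthetic ;;
        tmpfs|ramfs)                  echo ram ;;
        nfs|nfs4|cifs|fuse|fuse.*)    echo pipe ;;
        ext2|ext3|ext4|xfs|vfat)      echo block ;;
        *)                            echo unknown ;;
    esac
}

# Hypothetical usage: fs_category tmpfs
```

Bind mounts do not appear here because they have no filesystem type of their own; they show up in mountinfo with the type of the filesystem they expose.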
[parent not found: <7E28E74ACE78074AAD1BDD3E455CF8749422-w6YtkvcGFufufkSEj+1U85Z3qXmFLfmx@public.gmane.org>]
* Re: trying to build simple checkpoint/restart recipes
[not found] ` <7E28E74ACE78074AAD1BDD3E455CF8749422-w6YtkvcGFufufkSEj+1U85Z3qXmFLfmx@public.gmane.org>
@ 2010-12-08 22:26 ` Serge E. Hallyn
0 siblings, 0 replies; 5+ messages in thread
From: Serge E. Hallyn @ 2010-12-08 22:26 UTC (permalink / raw)
To: Rob Landley
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org

Quoting Rob Landley (rlandley-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org):
> > > The restoration of the mounts is not scriptable however. It involves
> > > parsing the mountinfo file and coordinating the mounts with those done by
> > > lxc itself during lxc-restart. I honestly haven't looked at that closely
> >
> > I'd be fine with requiring some bit of hand-parsing. But right, even
> > once we get a list of the mounts to be restored, I don't know of any
> > good way to get those mounts re-created at the right time.
>
> Mount code is one of my old stomping grounds from back when I wrote
> the busybox mount and switch_root commands and had to learn more
> implementation details about it than I ever wanted to know. :)
>
> I never could find a proper mount spec, and kept meaning to write one,
> but I blathered about some of the less obvious details here:
>
> http://www.mail-archive.com/busybox-9GAsQqxh4YTR7s880joybQ@public.gmane.org/msg07013.html

Bookmarked :)

> There are four top level categories of filesystem: Block backed, ram backed,
> pipe backed (network and fuse and so on), and synthetic (sysfs, procfs,
> devtmpfs...). And that's not counting bind mounts (which are internal
> to the VFS and not really a filesystem), and loopback devices (which are
> sort of the _opposite_ of a filesystem)...

Right, for starters handling only bind mounts would be useful. It's
feasible for userspace to rsync the contents of tmpfs filesystems
during checkpoint and before restart - but it's harder to find the
right place for the bind mounts to get re-attached if done in
userspace, because we don't want to do it too early and risk having
mount leaks (so we can't checkpoint later), and it's hard to coordinate
doing it later since someone inside the container has to do it (unless,
again, we have leaks - well, or maybe having MNT_SHARED / for the
container would suffice).

> > I suppose I could hack lxc-restart to do it. But I'm sort of hoping we
> > can get something less hacked and more true to the 'real' upstream
> > code.
>
> Which upstream code?

Heh, I should have said upstream-destined code. Referring to
lxc.sf.net and the kernel at www.linux-cr.org.

> > So do you know of anyone who's been working on re-creation of mounts
> > in the kernel? If not, what have you been doing, hand-scripting
> > all container creation, checkpoint, and restart?
>
> I express interest in this topic.

Awesome. Note that we've had lots of prior discussions about the
topic, it's just that we never came to a conclusion, so some fresh
experienced blood would be very helpful. The last time I went into
detail on the topic was at

http://www.mail-archive.com/devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org/msg21418.html

while some older notes on the simpler topics are at

https://ckpt.wiki.kernel.org/index.php/Mounts

thanks,
-serge
^ permalink raw reply [flat|nested] 5+ messages in thread
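The tmpfs half of that, saving contents at checkpoint time, might be sketched as below. It only prints the commands; `tmpfs_save_cmds` and the save directory are made-up names, the mount points are used as they appear in the mountinfo file (so run from the host they would still need the container rootfs prefixed), and it does nothing about the harder bind-mount timing problem.

```shell
#!/bin/sh
# Sketch: for every tmpfs entry in a saved mountinfo file, print the commands
# that would copy its contents under a save directory. Nothing is executed.
tmpfs_save_cmds() {
    awk -v dir="$2" '{
        # Skip the variable-length optional fields up to the "-" separator.
        for (i = 7; i <= NF; i++) if ($i == "-") break
        if ($(i + 1) == "tmpfs")
            printf "mkdir -p %s%s && rsync -a %s/ %s%s/\n", dir, $5, $5, dir, $5
    }' "$1"
}

# Hypothetical usage:
#   tmpfs_save_cmds cr1.mountinfo /root/cr1.tmpfs
```

Restoring before restart would be the same rsync with source and destination swapped, after the tmpfs itself has been mounted again.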