* [PATCH 0/3] domUloader
@ 2006-01-16 23:43 Kurt Garloff
2006-01-17 11:52 ` Anthony Liguori
` (2 more replies)
0 siblings, 3 replies; 30+ messages in thread
From: Kurt Garloff @ 2006-01-16 23:43 UTC (permalink / raw)
To: Xen development list
[-- Attachment #1.1: Type: text/plain, Size: 3336 bytes --]
Hi,
one of the troubles with the way xen boots paravirtualized kernels
is that you get the kernel and the initrd from domain 0, whereas
modules that are later loaded are on the domU filesystem.
This is a management headache: You must ensure that the kernel
and initrd you configure in dom0 for domU booting are in sync
with the kernel (modules) in domU.
Jeremy Katz has thankfully created some infrastructure to allow
plugging in bootloaders instead and contributed pygrub.
I extended the infrastructure a bit and added another bootloader.
Unlike pygrub it does not offer a menu and does not parse the
grub menu.lst; it's meant for paravirtualized domains and thus
we accept that the booted kernel is selected differently, e.g. by
a symlink, if one wants to control it from the domU. The bootloader
is called domUloader.
domUloader parses the bootentry (passed via --entry=) and the disk
setup (passed via --disks=). It then sets up loop devices as needed,
scans for partition tables (the exported disks / loop devs can
contain partitions) using kpartx (dm) and sets them up, so the kernel
and initrd can be copied to a temporary location in dom0.
The bootentry may contain a dev: prefix describing the partition
(from a domU perspective!) where kernel and initrd are located,
followed by kernel filename and (optional) initrd filenames relative
to the filesystem on dev:.
The kernel and initrd filename can also be relative to the domU root
filesystem. The domUloader then evaluates /etc/fstab found in the
root filesystem (passed via --root=) to locate kernel and initrd.
Afterwards everything is cleaned up. (We use the destructors, so
python reference counting makes sure this also happens when
exceptions occur.)
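To make the bootentry syntax described above concrete, here is a minimal sketch of splitting [dev:]kernel[,initrd] into its parts. It is not taken from domUloader.py: the function name parse_bootentry is hypothetical, the rule that a dev: prefix never contains a '/' is an assumption, and it is written for current Python 3 rather than the Python 2 used elsewhere in this thread.

```python
def parse_bootentry(entry):
    """Split '[dev:]kernel[,initrd]' into (dev, kernel, initrd).

    dev and initrd are None when they are absent.
    """
    dev = None
    # Assumption: a device prefix such as 'hda2' never contains '/', so the
    # text before the first ':' is only a device if it has no path separator.
    head, sep, rest = entry.partition(':')
    if sep and head and '/' not in head:
        dev, entry = head, rest
    kernel, _, initrd = entry.partition(',')
    if not kernel:
        raise ValueError("bootentry names no kernel: %r" % entry)
    return dev, kernel, initrd or None


print(parse_bootentry('hda2:/vmlinuz-xen,/initrd-xen'))
print(parse_bootentry('/boot/vmlinuz-xen'))
```

With a dev: prefix the filenames are relative to that partition; without one, the caller would have to resolve them through the root filesystem's /etc/fstab.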
Unlike pygrub, it does not use any code to understand filesystems or
partitions; the filesystem support comes from the dom0 kernel,
whereas kpartx (from multipath-tools) is used for the knowledge
of partitions and for setting up device-mapper.
More details by calling domUloader.py --help.
An example config could look like this:
bootentry = hda2:/vmlinuz-xen,/initrd-xen
bootloader = /path/to/domUloader.py
disk = ['phy:VG_Xen/LV_dom5,hda,w', 'file:/var/lib/xen/test,sda,w']
...
assuming LV_dom5 has a second partition with a filesystem containing
vmlinuz-xen and initrd-xen in its root fs (the /boot partition).
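As an illustration of how the device in bootentry relates to the disk list in this example, the following sketch maps a domU device name such as hda2 to the exported disk and a partition number. The helper names parse_disk and find_disk are hypothetical and the matching rule (strip trailing digits to get the whole-disk name) is an assumption, not necessarily what domUloader.py does. Python 3.

```python
import re


def parse_disk(spec):
    """Split 'uname,dev,mode' where uname is e.g. 'phy:...' or 'file:...'."""
    uname, dev, mode = spec.split(',')[:3]
    kind, _, path = uname.partition(':')
    return {'kind': kind, 'path': path, 'dev': dev, 'mode': mode}


def find_disk(disks, bootdev):
    """Return (disk, partition_number) for a domU device name like 'hda2'.

    partition_number is None when bootdev is itself an exported device.
    """
    parsed = [parse_disk(d) for d in disks]
    for disk in parsed:
        if disk['dev'] == bootdev:
            return disk, None
    # Assumption: 'hda2' means partition 2 of the exported whole disk 'hda'.
    m = re.match(r'^(.*?)(\d+)$', bootdev)
    if m:
        for disk in parsed:
            if disk['dev'] == m.group(1):
                return disk, int(m.group(2))
    raise LookupError("no exported disk provides %s" % bootdev)


disks = ['phy:VG_Xen/LV_dom5,hda,w', 'file:/var/lib/xen/test,sda,w']
print(find_disk(disks, 'hda2'))
```

For the example above this selects the phy: volume and partition 2, which is the partition a tool like kpartx would then have to map in dom0.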
or
bootentry = /boot/vmlinuz-xen,/boot/initrd-xen
bootloader = /path/to/domUloader.py
root = /dev/hda1
disk = ...
assuming that the root filesystem has an /etc/fstab that points the
way to /boot/vmlinuz-xen. (Does not need to be a separate FS.)
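A rough sketch of the fstab lookup described here: pick the fstab entry with the longest mount point that covers the requested path, then make the path relative to that filesystem. The function names and the sample fstab contents are made up for illustration, and the real script may resolve paths differently. Python 3.

```python
import posixpath


def parse_fstab(text):
    """Return (device, mountpoint) pairs from the contents of /etc/fstab."""
    entries = []
    for line in text.splitlines():
        fields = line.split('#', 1)[0].split()
        # Skip blank lines, comments and entries such as swap that have
        # no mount point in the directory tree.
        if len(fields) >= 2 and fields[1].startswith('/'):
            entries.append((fields[0], fields[1]))
    return entries


def locate(path, entries):
    """Return (device, path relative to that filesystem) for a domU path."""
    path = posixpath.normpath(path)
    best = None
    for dev, mnt in entries:
        mnt = posixpath.normpath(mnt)
        prefix = mnt if mnt == '/' else mnt + '/'
        if path == mnt or path.startswith(prefix):
            if best is None or len(mnt) > len(best[1]):
                best = (dev, mnt)
    if best is None:
        raise LookupError("no fstab entry covers %s" % path)
    dev, mnt = best
    return dev, '/' + posixpath.relpath(path, mnt)


sample = """
/dev/hda1  /      ext3  defaults  1 1
/dev/hda2  /boot  ext2  defaults  1 2
/dev/hda3  swap   swap  defaults  0 0
"""
print(locate('/boot/vmlinuz-xen', parse_fstab(sample)))
```

If /boot is not a separate filesystem, the same lookup falls back to the root entry and the path stays /boot/vmlinuz-xen.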
The following three mails will contain
(1) A patch to xend/XendDomainInfo.py, xend/XendBootloader.py and
xm/create.py, making sure that all the needed info is passed
to the bootloader and also stored for reuse on rebooting.
(2) A patch to make pygrub accept the new parameters passed by
XendBootloader.py (but so far pygrub just ignores them ...)
(3) The domUloader.py script.
Patches are against a working copy (8259) on my laptop; if no one
else will, I can create diffs against mercurial tip.
I hope this is useful to someone and can be integrated into the
Xen distribution.
Enjoy,
--
Kurt Garloff, Head Architect, Director SUSE Labs (act.), Novell Inc.
[-- Attachment #1.2: Type: application/pgp-signature, Size: 189 bytes --]
[-- Attachment #2: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
* Re: [PATCH 0/3] domUloader
2006-01-16 23:43 [PATCH 0/3] domUloader Kurt Garloff
@ 2006-01-17 11:52 ` Anthony Liguori
2006-01-17 14:34 ` Kurt Garloff
2006-01-18 18:06 ` Jeremy Katz
2006-01-17 12:33 ` [PATCH] " Tim Deegan
2006-03-22 18:59 ` Matt Ayres
2 siblings, 2 replies; 30+ messages in thread
From: Anthony Liguori @ 2006-01-17 11:52 UTC (permalink / raw)
To: Kurt Garloff; +Cc: Xen development list
Hi Kurt,
Kurt Garloff wrote:
>domUloader parses the bootentry (passed via --entry=) and the disk
>setup (passed via --disks=). It then sets up loop devices as needed,
>scans for partition tables (the exported disks / loop devs can
>contain partitions) using kpartx (dm) and sets them up, so the kernel
>and initrd can be copied to a temporary location in dom0.
>
>
Just to clarify, this means that domU filesystems are being mounted in
dom0? I know there were some security concerns voiced about this many
months ago. I think one of the advantages to using libext2 was that it
theoretically allowed the filesystem parsing to be done as a
non-privileged user.
Regards,
Anthony Liguori
* [PATCH] Re: [PATCH 0/3] domUloader
2006-01-16 23:43 [PATCH 0/3] domUloader Kurt Garloff
2006-01-17 11:52 ` Anthony Liguori
@ 2006-01-17 12:33 ` Tim Deegan
[not found] ` <1137607621.22846.17.camel@bree.local.net>
2006-03-22 18:59 ` Matt Ayres
2 siblings, 1 reply; 30+ messages in thread
From: Tim Deegan @ 2006-01-17 12:33 UTC (permalink / raw)
To: Xen development list
[-- Attachment #1: Type: text/plain, Size: 860 bytes --]
Hi,
For discussion, attached is a patch to do something similar using the
pygrub code. It adds another file syntax to xm create, allowing you to
say things like "kernel=guest:(vbda)/boot/vmlinuz", where vbda is what
the new domain will see its block device as.
Because it's based on pygrub, it only handles ext2fs and reiser for
now. Also, it hasn't got partition-table handling, since that's not
working in pygrub either, but it could be added if necessary.
The advantage of this is that the dom0 kernel never needs to read the
domU filesystems, so you could run the extraction code without so much
worrying.
The patch is against -unstable of last month, but I can update it if
people are interested.
Tim.
--
Tim Deegan (My opinions, not the University's)
Systems Research Group
University of Cambridge Computer Laboratory
[-- Attachment #2: guest.patch --]
[-- Type: text/x-patch, Size: 6236 bytes --]
# HG changeset patch
# User tjd21@labyrinth.cl.cam.ac.uk
# Node ID 0cdf19cb2a92a7782dd715ae82cfe9a220c1268c
# Parent 8098cc1daac47f5ac371947a669b9ba15c6cf3f4
Added "guest:(dev)/path/to/file" syntax to xm create filenames
Added pyfscat script to pygrub, needed for guest: syntax
Signed-off-by: Tim Deegan <tjd21@cam.ac.uk>
diff -r 8098cc1daac4 -r 0cdf19cb2a92 tools/pygrub/setup.py
--- a/tools/pygrub/setup.py Sun Dec 4 00:52:38 2005
+++ b/tools/pygrub/setup.py Sun Dec 4 12:20:27 2005
@@ -43,7 +43,7 @@
author_email='katzj@redhat.com',
license='GPL',
package_dir={'grub': 'src'},
- scripts = ["src/pygrub"],
+ scripts = ["src/pygrub", "src/pyfscat"],
packages=pkgs,
ext_modules = fsys_mods
)
diff -r 8098cc1daac4 -r 0cdf19cb2a92 tools/python/xen/xm/create.py
--- a/tools/python/xen/xm/create.py Sun Dec 4 00:52:38 2005
+++ b/tools/python/xen/xm/create.py Sun Dec 4 12:20:27 2005
@@ -26,6 +26,8 @@
import socket
import commands
import time
+import re
+import tempfile
import xen.lowlevel.xc
@@ -387,6 +389,9 @@
fn=set_value, default=None,
use="X11 display to use")
+gopts.var('delete_after', val='FILES',
+ fn=append_value, default=[],
+ use="Files to delete after domain starts")
def err(msg):
"""Print an error to stderr and exit.
@@ -409,13 +414,52 @@
else:
return s
+def file_from_guest(file, vals):
+    """Extract a file from one of the guest disk images.
+    """
+    if len(vals.disk) < 1:
+        err("No disks configured and guest-fs file requested")
+    m = re.match(r'^\(([^)]+)\)(.*)$', file)
+    if m is not None:
+        dev = m.group(1)
+        file = m.group(2)
+        dev_uname = None
+        for disk in vals.disk:
+            (uname, vdev, _, _) = disk
+            if dev == vdev:
+                dev_uname = uname
+                break
+        if dev_uname is None:
+            err("Can't find guest block dev %s to extract %s" % (dev, file))
+    else:
+        (dev_uname, _, _, _) = vals.disk[0]
+    dev_file = blkif.blkdev_uname_to_file(dev_uname)
+    if dev_file is None:
+        err("Can't find guest disk image %s to extract %s" % (dev_uname, file))
+    (tfd, fname) = tempfile.mkstemp()
+    os.close(tfd)
+    rv = os.system("pyfscat -q %s %s %s" % (dev_file, file, fname))
+    if rv != 0:
+        os.unlink(fname)
+        err("Can't extract %s from guest disk image %s" % (file, dev_uname))
+    vals.delete_after.append(fname)
+    return fname
+
+def parse_filename(file, vals):
+    """Parse filename for 'guest:' format
+    """
+    if file.startswith("guest:"):
+        return file_from_guest(file[6:], vals)
+    else:
+        return os.path.abspath(file)
+
 def configure_image(vals):
     """Create the image config.
     """
     config_image = [ vals.builder ]
-    config_image.append([ 'kernel', os.path.abspath(vals.kernel) ])
+    config_image.append([ 'kernel', parse_filename(vals.kernel, vals) ])
     if vals.ramdisk:
-        config_image.append([ 'ramdisk', os.path.abspath(vals.ramdisk) ])
+        config_image.append([ 'ramdisk', parse_filename(vals.ramdisk, vals) ])
     if vals.cmdline_ip:
         cmdline_ip = strip('ip=', vals.cmdline_ip)
         config_image.append(['ip', cmdline_ip])
@@ -803,6 +847,10 @@
server.xend_domain_destroy(dom)
err("Failed to unpause domain %s" % dom)
opts.info("Started domain %s" % (dom))
+
+    for file in opts.vals.delete_after:
+        os.unlink(file)
+
return int(sxp.child_value(dominfo, 'domid'))
def parseCommandLine(argv):
diff -r 8098cc1daac4 -r 0cdf19cb2a92 tools/pygrub/src/pyfscat
--- /dev/null Sun Dec 4 00:52:38 2005
+++ b/tools/pygrub/src/pyfscat Sun Dec 4 12:20:27 2005
@@ -0,0 +1,82 @@
+#!/usr/bin/python
+#
+# pyfscat - extract a file's contents from a filesystem image.
+#
+# Copyright (C) 2005 XenSource Ltd
+#
+# Based on pygrub, Copyright 2005 Red Hat, Inc.
+# Jeremy Katz <katzj@redhat.com>
+#
+# This software may be freely redistributed under the terms of the GNU
+# general public license.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+#
+
+import os, sys
+
+sys.path = [ '/usr/lib/python' ] + sys.path
+
+import grub.fsys
+
+def file_from_fs_image(dev_path, path, out_path, quiet):
+    """Extract a file from a filesystem image using pygrub's libraries.
+    """
+    offset = 0
+    fs = None
+    for fstype in grub.fsys.fstypes.values():
+        if fstype.sniff_magic(dev_path, offset):
+            fs = fstype.open_fs(dev_path, offset)
+            break
+    if fs is None:
+        if not quiet:
+            print >> sys.stderr, ("Can't open a filesystem image in %s"
+                                  % (dev_path))
+        return False
+    if not fs.file_exist(path):
+        if not quiet:
+            print >> sys.stderr, ("Can't find file %s in %s"
+                                  % (path, dev_path))
+        return False
+    fd = fs.open_file(path, os.O_RDONLY)
+    contents = fd.read()
+
+    # We have something to write: fix up somewhere to put it
+    if out_path == "-":
+        sys.stdout.write(contents)
+    else:
+        try:
+            out_file = open(out_path, 'wb')
+            out_file.write(contents)
+            out_file.close()
+        except IOError:
+            if not quiet:
+                print >> sys.stderr, "Can't write to %s" % (out_path)
+            return False
+    return True
+
+def usage():
+    print >> sys.stderr, ("Usage: %s [-q] <image> <path> [<dest-file>]\n"
+                          "\n"
+                          "Extracts file <path> from fs image <image> "
+                          "to <dest-file> (or stdout)"
+                          % (sys.argv[0],))
+
+
+quiet = False
+if len(sys.argv) > 1 and sys.argv[1] == "-q":
+    quiet = True
+    del sys.argv[1]
+
+if len(sys.argv) == 3:
+    out_path = "-"
+elif len(sys.argv) == 4:
+    out_path = sys.argv[3]
+else:
+    usage()
+    sys.exit(1)
+
+if not file_from_fs_image(sys.argv[1], sys.argv[2], out_path, quiet):
+    sys.exit(1)
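One thing worth noting about the create.py hunk above: it builds the pyfscat command with os.system() and string interpolation, so a guest path containing spaces or shell metacharacters is interpreted by the shell. A possible alternative, sketched here and not part of the posted patch, passes the arguments as a list through the subprocess module. The helper names pyfscat_argv and extract are hypothetical; the option and argument order follow the usage text of pyfscat above. Python 3.

```python
import subprocess


def pyfscat_argv(dev_file, path, out_path, quiet=True):
    """Build the argument vector for pyfscat without involving a shell."""
    argv = ['pyfscat']
    if quiet:
        argv.append('-q')
    # Each value is passed as one argument, whatever characters it contains.
    argv += [dev_file, path, out_path]
    return argv


def extract(dev_file, path, out_path):
    """Run pyfscat and return True if it exited with status 0."""
    return subprocess.call(pyfscat_argv(dev_file, path, out_path)) == 0


print(pyfscat_argv('/dev/VG_Xen/LV_dom5', '/boot/vmlinuz', '/tmp/kernel'))
```

Because no shell is involved, a path such as "/boot/my kernel; reboot" reaches pyfscat as a single filename argument.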
* Re: [PATCH 0/3] domUloader
2006-01-17 11:52 ` Anthony Liguori
@ 2006-01-17 14:34 ` Kurt Garloff
2006-01-17 17:28 ` Adam Heath
2006-01-17 21:41 ` Anthony Liguori
2006-01-18 18:06 ` Jeremy Katz
1 sibling, 2 replies; 30+ messages in thread
From: Kurt Garloff @ 2006-01-17 14:34 UTC (permalink / raw)
To: Anthony Liguori; +Cc: Xen development list
[-- Attachment #1.1: Type: text/plain, Size: 1893 bytes --]
Hi Anthony,
On Tue, Jan 17, 2006 at 05:52:14AM -0600, Anthony Liguori wrote:
> Just to clarify, this means that domU filesystems are being mounted in
> dom0?
Correct.
> I know there were some security concerns voiced about this many
> months ago. I think one of the advantages to using libext2 was that it
> theoretically allowed the filesystem parsing to be done as a
> non-privileged user.
I can see your point.
There are two concerns you could have:
1. When the domU fs gets mounted in dom0, a local user there could
get (read-only) access to data that he shouldn't have access to.
This can be prevented by mounting under a directory that's not
readable to anyone but root. I didn't do this in my patch set,
but it's certainly a good idea.
(And dom0 root you need to trust anyway, such is the trust model
in a hybrid virtualization model without encrypting everything.)
2. The filesystem in the domU could be prepared such that the kernel
trips over a bug in its filesystem code.
The same can happen if you read the FS with a userspace library
of course, but the effects would be less bad -- at least if you
would do it with non-root euid.
The downside is that you need to use a secondary source for filesystem
code, which needs to be maintained and kept in sync, audited, ...
And you are limited to the filesystems for which you have userspace
libraries.
In a paranoid scenario, you would not load any data from the domU
filesystem in any way :-) But I can see why you would choose
pygrub over domUloader in a sensitive environment, where you
can't trust the domU admins. Point taken.
I still think that in many use scenarios, you would be perfectly
fine with domUloader.
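Point 1 above (mounting under a directory that only root can read) could look roughly like the sketch below. It relies on tempfile.mkdtemp(), which creates a directory accessible only to the creating user. The actual mount point is a subdirectory, because once a filesystem is mounted the permissions seen at the mount point are those of the mounted filesystem's root, so the protection has to come from the parent. The function name and the 'domU-' prefix are made up for illustration. Python 3.

```python
import os
import stat
import tempfile


def private_mountpoint(parent=None):
    """Return (private_dir, mountpoint) with mountpoint inside private_dir.

    private_dir is created with mode 0700, so other local users cannot
    traverse it to reach whatever gets mounted on mountpoint.
    """
    private_dir = tempfile.mkdtemp(prefix='domU-', dir=parent)
    mountpoint = os.path.join(private_dir, 'mnt')
    os.mkdir(mountpoint)
    return private_dir, mountpoint


private_dir, mountpoint = private_mountpoint()
mode = stat.S_IMODE(os.stat(private_dir).st_mode)
print(oct(mode), mountpoint)

# Clean up the demonstration directories again.
os.rmdir(mountpoint)
os.rmdir(private_dir)
```

When run as root in dom0, only root can then look below the private directory; this does nothing about concern 2.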
Did I catch your concerns?
--
Kurt Garloff, Head Architect, Director SUSE Labs (act.), Novell Inc.
[-- Attachment #1.2: Type: application/pgp-signature, Size: 189 bytes --]
* Re: [PATCH 0/3] domUloader
2006-01-17 14:34 ` Kurt Garloff
@ 2006-01-17 17:28 ` Adam Heath
2006-01-17 21:28 ` Kurt Garloff
2006-01-17 21:41 ` Anthony Liguori
1 sibling, 1 reply; 30+ messages in thread
From: Adam Heath @ 2006-01-17 17:28 UTC (permalink / raw)
To: Kurt Garloff; +Cc: Xen development list
On Tue, 17 Jan 2006, Kurt Garloff wrote:
> 2. The filesystem in the domU could be prepared such that the kernel
> trips over a bug in its filesystem code.
> The same can happen if you read the FS with a userspace library
> of course, but the effects would be less bad -- at least if you
> would do it with non-root euid.
> The downside is that you need to use a secondary source for filesystem
> code, which needs to be maintained and kept in sync, audited, ...
> And you are limited to the filesystems for which you have userspace
> libraries.
> In a paranoid scenario, you would not load any data from the domU
> filesystem in any way :-) But I can see why you would choose
> pygrub over domUloader in a sensitive environment, where you
> can't trust the domU admins. Point taken.
> I still think that in many use scenarios, you would be perfectly
> fine with domUloader.
Have a special kernel that is used just for this, then boot a temporary domU,
using this special kernel, read the data you need from the filesystem, then
shut it down.
* Re: [PATCH 0/3] domUloader
2006-01-17 17:28 ` Adam Heath
@ 2006-01-17 21:28 ` Kurt Garloff
0 siblings, 0 replies; 30+ messages in thread
From: Kurt Garloff @ 2006-01-17 21:28 UTC (permalink / raw)
To: Adam Heath; +Cc: Xen development list
[-- Attachment #1.1: Type: text/plain, Size: 967 bytes --]
Hi Adam,
On Tue, Jan 17, 2006 at 11:28:58AM -0600, Adam Heath wrote:
> On Tue, 17 Jan 2006, Kurt Garloff wrote:
>
> > In a paranoid scenario, you would not load any data from the domU
> > filesystem in any way :-) But I can see why you would choose
> > pygrub over domUloader in a sensitive environment, where you
> > can't trust the domU admins. Point taken.
> > I still think that in many use scenarios, you would be perfectly
> > fine with domUloader.
>
> Have a special kernel that is used just for this, then boot a temporary domU,
> using this special kernel, read the data you need from the filesystem, then
> shut it down.
Good solution but quite complex ...
I wonder whether it would be easier porting grub to xen.
For now something simple that just works and is secure enough for 90+%
of the users does not look so bad to me.
Best,
--
Kurt Garloff, Head Architect, Director SUSE Labs (act.), Novell Inc.
[-- Attachment #1.2: Type: application/pgp-signature, Size: 189 bytes --]
* Re: [PATCH 0/3] domUloader
2006-01-17 14:34 ` Kurt Garloff
2006-01-17 17:28 ` Adam Heath
@ 2006-01-17 21:41 ` Anthony Liguori
1 sibling, 0 replies; 30+ messages in thread
From: Anthony Liguori @ 2006-01-17 21:41 UTC (permalink / raw)
To: Kurt Garloff; +Cc: Xen development list
Kurt Garloff wrote:
>Hi Anthony,
>
>
>>I know there were some security concerns voiced about this many
>>months ago. I think one of the advantages to using libext2 was that it
>>theoretically allowed the filesystem parsing to be done as a
>>non-privileged user.
>>
>>
>
>I can see your point.
>
>There are two concerns you could have:
>
>1. When the domU fs gets mounted in dom0, a local user there could
> get (read-only) access to data that he shouldn't have access to.
> This can be prevented by mounting under a directory that's not
> readable to anyone but root. I didn't do this in my patch set,
> but it's certainly a good idea.
> (And dom0 root you need to trust anyway, such is the trust model
> in a hybrid virtualization model without encrypting everything.)
>
>2. The filesystem in the domU could be prepared such that the kernel
> trips over a bug in its filesystem code.
> The same can happen if you read the FS with a userspace library
> of course, but the effects would be less bad -- at least if you
> would do it with non-root euid.
> The downside is that you need to use a secondary source for filesystem
> code, which needs to be maintained and kept in sync, audited, ...
> And you are limited to the filesystems for which you have userspace
> libraries.
> In a paranoid scenario, you would not load any data from the domU
> filesystem in any way :-) But I can see why you would choose
> pygrub over domUloader in a sensitive environment, where you
> can't trust the domU admins. Point taken.
> I still think that in many use scenarios, you would be perfectly
> fine with domUloader.
>
>Did I catch your concerns?
>
>
Yup, just wanted to make sure it was considered :-)
Regards,
Anthony Liguori
* Re: [PATCH 0/3] domUloader
2006-01-17 11:52 ` Anthony Liguori
2006-01-17 14:34 ` Kurt Garloff
@ 2006-01-18 18:06 ` Jeremy Katz
2006-01-18 23:21 ` Kurt Garloff
1 sibling, 1 reply; 30+ messages in thread
From: Jeremy Katz @ 2006-01-18 18:06 UTC (permalink / raw)
To: Anthony Liguori; +Cc: Xen development list, Kurt Garloff
On Tue, 2006-01-17 at 05:52 -0600, Anthony Liguori wrote:
> Kurt Garloff wrote:
> >domUloader parses the bootentry (passed via --entry=) and the disk
> >setup (passed via --disks=). It then sets up loop devices as needed,
> >scans for partition tables (the exported disks / loop devs can
> >contain partitions) using kpartx (dm) and sets them up, so the kernel
> >and initrd can be copied to a temporary location in dom0.
> >
> Just to clarify, this means that domU filesystems are being mounted in
> dom0? I know there were some security concerns voiced about this many
> months ago. I think one of the advantages to using libext2 was that it
> theoretically allowed the filesystem parsing to be done as a
> non-privileged user.
The other concern with mounting is that there have been some cases where
changes to filesystems have broken reading new filesystems with older
kernels. It's a lot easier to get the library that supports more (and
less has to be supported, so you're less likely to need to make changes)
than to upgrade your kernel for dom0.
Jeremy
* Re: [PATCH 0/3] domUloader
2006-01-18 18:06 ` Jeremy Katz
@ 2006-01-18 23:21 ` Kurt Garloff
2006-01-19 4:31 ` Anthony Liguori
0 siblings, 1 reply; 30+ messages in thread
From: Kurt Garloff @ 2006-01-18 23:21 UTC (permalink / raw)
To: Jeremy Katz; +Cc: Xen development list
[-- Attachment #1.1: Type: text/plain, Size: 892 bytes --]
Hi Jeremy,
On Wed, Jan 18, 2006 at 01:06:04PM -0500, Jeremy Katz wrote:
> The other concern with mounting is that there have been some cases where
> changes to filesystems have broken reading new filesystems with older
> kernels. It's a lot easier to get the library that supports more (and
> less has to be supported, so you're less likely to need to make changes)
> than to upgrade your kernel for dom0.
I tend to disagree.
As the dom0 kernel drives the hardware and hardware drivers seem
to be the more prominent reason for moving to new kernel versions,
I would assume the dom0 kernel is more likely to be updated than the
domU kernels.
And actually, filesystem forward compatibility is not that bad.
I'm not saying this can't be an issue, but I suspect it won't be
for most people.
Cheers,
--
Kurt Garloff, Head Architect, Director SUSE Labs (act.), Novell Inc.
[-- Attachment #1.2: Type: application/pgp-signature, Size: 189 bytes --]
* Re: [PATCH 0/3] domUloader
2006-01-18 23:21 ` Kurt Garloff
@ 2006-01-19 4:31 ` Anthony Liguori
2006-01-19 17:19 ` Jeremy Katz
0 siblings, 1 reply; 30+ messages in thread
From: Anthony Liguori @ 2006-01-19 4:31 UTC (permalink / raw)
To: Kurt Garloff; +Cc: Jeremy Katz, Xen development list
On a side note, one thing we all have to think about is how a boot
loader would work with something like a virtual framebuffer.
It may be time to start thinking about writing a first class domU
bootloader. Something that just sets up a page table that maps the pfns
linearly and enough XenBus to read from a virtual disk. We can reuse
code from grub for filesystem parsing (or even write it from
scratch--it's not that hard to just read from a filesystem).
We could also use mini-OS as a base.
Regards,
Anthony Liguori
* Re: [PATCH] Re: [PATCH 0/3] domUloader
[not found] ` <1137607621.22846.17.camel@bree.local.net>
@ 2006-01-19 13:06 ` Tim Deegan
2006-01-20 12:43 ` Kurt Garloff
2006-01-23 13:39 ` Tim Deegan
0 siblings, 2 replies; 30+ messages in thread
From: Tim Deegan @ 2006-01-19 13:06 UTC (permalink / raw)
To: Jeremy Katz; +Cc: xen-devel
On Wed, Jan 18, 2006 at 01:07:00PM -0500, Jeremy Katz wrote:
> Sounds reasonable enough, although I'll have to look at it a little
> closer when I get back from Austin. FWIW, partition table handling in
> pygrub should work fine (I'm installing to full disk vbds with partition
> tables regularly)
The partition handling is only enough to find the "active" partition, so
it doesn't handle extended partitions. That's not a problem for pygrub,
but would need to be done to have the extraction tool handle partitions
properly.
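For readers unfamiliar with the layout being discussed, the sketch below decodes the four primary entries of a DOS partition table and reports which one is flagged active and which are extended containers. It is an illustration only, not code from pygrub: the function name read_mbr is hypothetical, and logical partitions inside an extended partition (which live in a chain of further boot records) are deliberately not followed. Python 3.

```python
import struct

# Partition type bytes that mark an extended (container) partition.
EXTENDED_TYPES = (0x05, 0x0F, 0x85)


def read_mbr(sector):
    """Decode the primary partition entries from the first 512 bytes of a disk."""
    if len(sector) < 512 or sector[510:512] != b'\x55\xaa':
        raise ValueError("no DOS partition table signature")
    parts = []
    for i in range(4):
        entry = sector[446 + 16 * i:446 + 16 * (i + 1)]
        # boot flag, 3 CHS bytes, type, 3 CHS bytes, start LBA, sector count
        flag, ptype, start, count = struct.unpack('<B3xB3xII', entry)
        if ptype == 0:
            continue  # unused slot
        parts.append({
            'number': i + 1,
            'active': flag == 0x80,
            'extended': ptype in EXTENDED_TYPES,
            'type': ptype,
            'start_sector': start,
            'sectors': count,
        })
    return parts
```

A tool that only looks for the entry with 'active' set never needs to look inside an 'extended' entry, which is the limitation described above.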
Also, it doesn't work if your e2fsprogs are too old to have
ext2fs_open2() -- again, not really a bug but the failure mode is a bit
ugly, and the version in the Xen 3 tarball has this problem. Is there
some way of telling from inside a python script whether the pygrub
library is going to be able to read partitions or not?
Tim.
--
Tim Deegan (My opinions, not the University's)
Systems Research Group
University of Cambridge Computer Laboratory
* Re: [PATCH 0/3] domUloader
2006-01-19 4:31 ` Anthony Liguori
@ 2006-01-19 17:19 ` Jeremy Katz
2006-01-20 20:36 ` Stephen Tweedie
0 siblings, 1 reply; 30+ messages in thread
From: Jeremy Katz @ 2006-01-19 17:19 UTC (permalink / raw)
To: Anthony Liguori; +Cc: Xen development list, Kurt Garloff
On Wed, 2006-01-18 at 22:31 -0600, Anthony Liguori wrote:
> On a side note, one thing we all have to think about is how a boot
> loader would work with something like a virtual framebuffer.
Yeah :/
> It may be time to start thinking about writing a first class domU
> bootloader. Something that just sets up a page table that maps the pfns
> linearly and enough XenBus to read from a virtual disk. We can reuse
> code from grub for filesystem parsing (or even write it from
> scratch--it's not that hard to just read from a filesystem).
>
> We could also use mini-OS as a base.
The problem is where does something like this end? So we add a basic
blkfront. Then someone wants to do some form of netboot. Or boot on
iSCSI. Or they use something like GFS or OCFS2 which require
significantly more infrastructure than most filesystems. And then,
there is a world of pain :/
Unfortunately, I am completely convinced that the right thing is to have
the kernel for domU inside the domU's filesystem because anything else
is just fundamentally not manageable. So, perhaps we do have to just
suck it up and go the path of what's essentially mini-OS as a domU
"bios".
Jeremy
* Re: [PATCH] Re: [PATCH 0/3] domUloader
2006-01-19 13:06 ` Tim Deegan
@ 2006-01-20 12:43 ` Kurt Garloff
2006-01-23 13:39 ` Tim Deegan
1 sibling, 0 replies; 30+ messages in thread
From: Kurt Garloff @ 2006-01-20 12:43 UTC (permalink / raw)
To: Tim Deegan; +Cc: Jeremy Katz, xen-devel
[-- Attachment #1.1.1: Type: text/plain, Size: 2290 bytes --]
Hi Tim, Jeremy,
On Thu, Jan 19, 2006 at 01:06:18PM +0000, Tim Deegan wrote:
> On Wed, Jan 18, 2006 at 01:07:00PM -0500, Jeremy Katz wrote:
> > Sounds reasonable enough, although I'll have to look at it a little
> > closer when I get back from Austin. FWIW, partition table handling in
> > pygrub should work fine (I'm installing to full disk vbds with partition
> > tables regularly)
>
> The partition handling is only enough to find the "active" partition, so
> it doesn't handle extended partitions. That's not a problem for pygrub,
> but would need to be done to have the extraction tool handle partitions
> properly.
pygrub does assume there's one whole disk export which contains
a DOS style partition table with the /boot partition marked active.
If that one is ext2, everything works fine. (The reiser support
failed in my testing.)
domUloader is more flexible there. It understands both whole disk
devices and partitions, can handle a set of them, finds the /boot
partition by having it specified in bootentry or by parsing
/etc/fstab on the partition that's been passed to domU with root=.
Maybe we want to import domUloader in pygrub to just get all this,
as the handling is nicely abstracted there. You just don't call
domUloader.main(argv) then ...
If we do that, the main difference will be that
- pygrub offers an interactive (ncurses) mode
- pygrub uses libraries to get files off the FS, which could be
somewhat safer and more easily extensible (but currently is not
(yet?) very reliable and limits the FS choice).
And I'd really like to advocate for the little changes I have done to
XendDomainInfo, XendBootloader and create, so we pass enough information
to the bootloader to handle all this.
> Also, it doesn't work if your e2fsprogs are too old to have
> ext2fs_open2() -- again, not really a bug but the failure mode is a bit
> ugly,
So I'm not the only one who experiences pygrub running OOM?
> and the version in the Xen 3 tarball has this problem. Is there
> some way of telling from inside a python script whether the pygrub
> library is going to be able to read partitions or not?
Updated domUloader attached.
Best,
--
Kurt Garloff, Head Architect, Director SUSE Labs (act.), Novell Inc.
[-- Attachment #1.1.2: domUloader.py --]
[-- Type: text/plain, Size: 14527 bytes --]
#!/usr/bin/env python
# domUloader.py
"""Loader for kernel and (optional) ramdisk from domU filesystem
Uses bootentry = [dev:]kernel[,initrd] to get a kernel [and initrd]
from a domU filesystem to boot it for Xen.
dev is the disk as seen by domU, filenames are relative to
that filesystem. The script uses the disk settings from the
config file to find the domU filesystems.
The bootentry is passed to the script using --entry=
Optionally, dev: can be omitted; the script then looks at the
root filesystem, parses /etc/fstab to resolve the path to the
kernel [and the initrd]. Obviously, the paths relative to the
domU root filesystem need to be specified for the kernel
and initrd filenames.
The root FS is passed using --root=, the filesystem setup in
--disks=. The disks list is a python list
[[uname, dev, mode, backend], [uname, dev, mode, backend], ...]
passed as a string. The script writes an sxpr specifying the
locations of the copied kernel and initrd into the file
specified by --output (default is stdout).
Limitations:
- It is assumed both kernel and initrd are on the same filesystem.
- domUs might use LVM; the script currently does not have support
for setting up LVM mappings for domUs; it's not trivial and we
might risk namespace conflicts. If you want to use LVM inside domUs,
set up a small non-LVM boot partition and specify it in bootentry.
The script uses kpartx (multipath-tools) to create mappings for
devices that are exported as whole disk devices that are partitioned.
(c) 01/2006 Novell Inc
License: GNU GPL
Author: Kurt Garloff <garloff@suse.de>
"""
import os, sys, getopt
from xen.xend import sxp
import tempfile
# Global options
quiet = False
verbose = False
tmpdir = '/var/lib/xen/tmp'
# List of partitions
# It's created by setting up all the devices from the xen disk
# config; every entry creates one Wholedisk object, which does necessary
# preparatory steps such as losetup and kpartx -a; then a Partition
# object is setup for every partition (which may be one or several per
# Wholedisk); it references the Wholedisk if needed; python reference
# counting will take care of the cleanup.
partitions = []
# Helper functions
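def traildigits(strg):
    "Return the trailing digits, used to split the partition number off"
    idx = len(strg)-1
    # Walk backwards over the digits; stop at the start of the string
    # (this also covers empty and all-digit strings).
    while idx >= 0 and strg[idx].isdigit():
        idx -= 1
    return strg[idx+1:]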
def isWholedisk(domUname):
"Determines whether dev is a wholedisk dev"
return not domUname[-1:].isdigit()
def freeLoopDev():
"Finds a free loop device; racy!"
loops = []
fd = os.popen("losetup -a")
for ln in fd.readlines():
loops.append(ln.split(':')[0])
for nr in range(0,256):
if "/dev/loop%i" % nr not in loops:
return "/dev/loop%i" % nr
return None
def findPart(dev):
"Find device dev in list of partitions"
if len(dev) > 5 and dev[:5] == "/dev/":
dev = dev[5:]
for part in partitions:
if dev == part.domname:
return part
return None
class Wholedisk:
"Class representing a whole disk that has partitions"
def __init__(self, domname, physdev, loopfile = None):
"c'tor: set up"
self.domname = domname
self.physdev = physdev
self.loopfile = loopfile
self.mapped = 0
self.pcount = self.scanpartitions()
def loopsetup(self):
"Setup the loop mapping"
if self.loopfile and not self.physdev:
ldev = freeLoopDev()
if not ldev:
raise RuntimeError("domUloader: No free loop device found")
if verbose:
print "domUloader: losetup %s %s" % (ldev, self.loopfile)
fd = os.popen("losetup %s %s" % (ldev, self.loopfile))
if fd.close():
raise RuntimeError("domUloader: Failure setting up loop dev")
self.physdev = ldev
def loopclean(self):
"Delete the loop mapping"
if self.loopfile and self.physdev:
if verbose:
print "domUloader: losetup -d %s" % self.physdev
fd = os.popen("losetup -d %s" % self.physdev)
self.physdev = None
return fd.close()
def scanpartitions(self):
"""Scan device for partitions (kpartx -l) and set up data structures,
Returns number of partitions found."""
self.loopsetup()
if not self.physdev:
raise RuntimeError("domUloader: No physical device? %s" % self.__repr__())
# TODO: We could use fdisk -l instead and look at the type of
# partitions; this way we could also detect LVM and support it.
fd = os.popen("kpartx -l %s" % self.physdev)
pcount = 0
for line in fd.readlines():
line = line.strip()
(pname, params) = line.split(':')
pno = int(traildigits(pname.strip()))
#if pname.rfind('/') != -1:
# pname = pname[pname.rfind('/')+1:]
#pname = self.physdev[:self.physdev.rfind('/')] + '/' + pname
pname = "/dev/mapper/" + pname
partitions.append(Partition(self, self.domname + '%i' % pno, pname))
pcount += 1
fd.close()
if not pcount:
if self.loopfile:
ref = self
else:
ref = None
partitions.append(Partition(ref, self.domname, self.physdev))
self.loopclean()
return pcount
def activatepartitions(self):
"Set up loop mapping and device-mapper mappings"
if not self.mapped:
self.loopsetup()
if self.pcount:
if verbose:
print "domUloader: kpartx -a %s" % self.physdev
fd = os.popen("kpartx -a %s" % self.physdev)
fd.close()
self.mapped += 1
def deactivatepartitions(self):
"Remove device-mapper mappings and loop mapping"
if not self.mapped:
return
self.mapped -= 1
if not self.mapped:
if self.pcount:
if verbose:
print "domUloader: kpartx -d %s" % self.physdev
fd = os.popen("kpartx -d %s" % self.physdev)
fd.close()
self.loopclean()
def __del__(self):
"d'tor: clean up"
self.deactivatepartitions()
self.loopclean()
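    def __repr__(self):
        "string representation for debugging"
        # physdev is None while no loop device is attached, and pcount is
        # not set yet while scanpartitions() is still running, so neither
        # can be concatenated blindly.
        strg = "[" + self.domname + "," + str(self.physdev) + ","
        if self.loopfile:
            strg += self.loopfile
        strg += "," + str(getattr(self, "pcount", "?")) + ",mapped %ix]" % self.mapped
        return strg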
class Partition:
"""Class representing a domU filesystem (partition) that can be
mounted in dom0"""
def __init__(self, whole = None, domname = None,
physdev = None):
"c'tor: setup"
self.wholedisk = whole
self.domname = domname
self.physdev = physdev
self.mountpoint = None
def __del__(self):
"d'tor: cleanup"
if self.mountpoint:
self.umount()
# Not needed: Refcounting will take care of it.
#if self.wholedisk:
# self.wholedisk.deactivatepartitions()
def __repr__(self):
"string representation for debugging"
strg = "[" + self.domname + "," + self.physdev + ","
if self.mountpoint:
strg += "mounted on " + self.mountpoint + ","
else:
strg += "not mounted,"
if self.wholedisk:
return strg + self.wholedisk.__repr__() + "]"
else:
return strg + "]"
def mount(self, fstype = None, options = "ro"):
"mount filesystem, sets self.mountpoint"
if self.mountpoint:
return
if self.wholedisk:
self.wholedisk.activatepartitions()
mtpt = tempfile.mkdtemp(prefix = "%s." % self.domname, dir = tmpdir)
mopts = ""
if fstype:
mopts += " -t %s" % fstype
mopts += " -o %s" % options
if verbose:
print "domUloader: mount %s %s %s" % (mopts, self.physdev, mtpt)
fd = os.popen("mount %s %s %s" % (mopts, self.physdev, mtpt))
err = fd.close()
if err:
raise RuntimeError("domUloader: Error %i from mount %s %s on %s" % \
(err, mopts, self.physdev, mtpt))
self.mountpoint = mtpt
def umount(self):
"umount filesystem at self.mountpoint"
if not self.mountpoint:
return
if verbose:
print "domUloader: umount %s" % self.mountpoint
fd = os.popen("umount %s" % self.mountpoint)
err = fd.close()
os.rmdir(self.mountpoint)
if err:
raise RuntimeError("Error %i from umount %s" % \
(err, self.mountpoint))
self.mountpoint = None
if self.wholedisk:
self.wholedisk.deactivatepartitions()
def setupOneDisk(cfg):
"""Sets up one exported disk (incl. partitions if existing)
@param cfg: 4-tuple (uname, dev, mode, backend)"""
from xen.util.blkif import blkdev_uname_to_file
(type, dev) = cfg[0].split(':')
(loopfile, physdev) = (None, None)
if type == "file":
loopfile = dev
elif type == "phy":
physdev = blkdev_uname_to_file(cfg[0])
wdisk = Wholedisk(cfg[1], physdev, loopfile)
def setupDisks(vbds):
"""Create a list of all disks from the disk config:
@param vbds: The disk config as list of 4-tuples
(uname, dev, mode, backend)"""
disks = eval(eval(vbds))
for disk in disks:
setupOneDisk(disk)
if verbose:
print "Partitions: " + str(partitions)
class Fstab:
"Class representing an fstab"
class FstabEntry:
"Class representing one fstab line"
def __init__(self, line):
"c'tor: parses one line"
spline = line.split()
self.dev, self.mtpt, self.fstype, self.opts = \
spline[0], spline[1], spline[2], spline[3]
if len(self.mtpt) > 1:
self.mtpt = self.mtpt.rstrip('/')
def __init__(self, filename):
"c'tor: parses fstab"
self.entries = []
fd = open(filename)
for line in fd.readlines():
line = line.strip()
if len(line) == 0 or line[0] == '#':
continue
self.entries.append(Fstab.FstabEntry(line))
def find(self, fname):
"Looks for matching filesystem in fstab"
matchlen = 0
match = None
fnmlst = fname.split('/')
for fs in self.entries:
entlst = fs.mtpt.split('/')
# '/' needs special treatment :-(
if entlst == ['','']:
entlst = ['']
entln = len(entlst)
if len(fnmlst) >= entln and fnmlst[:entln] == entlst \
and entln > matchlen:
match = fs
matchlen = entln
if not match:
return (None, None)
return (match.dev, match.mtpt)
def fsFromFstab(kernel, initrd, root):
"""Investigate rootFS fstab, check for filesystem that contains the kernel
and return it; also returns adapted kernel and initrd path.
"""
part = findPart(root)
if not part:
raise RuntimeError("domUloader: Root fs %s not exported?" % root)
part.mount()
if not os.access(part.mountpoint + '/etc/fstab', os.R_OK):
part.umount()
raise RuntimeError("domUloader: /etc/fstab not found on %s" % root)
fstab = Fstab(part.mountpoint + '/etc/fstab')
(dev, fs) = fstab.find(kernel)
if not fs:
raise RuntimeError("domUloader: no matching filesystem for image %s found in fstab" % kernel)
#return (None, kernel, initrd)
if fs == '/':
ln = 0
# this avoids the stupid /dev/root problem
dev = root
else:
ln = len(fs)
kernel = kernel[ln:]
if initrd:
initrd = initrd[ln:]
if verbose:
print "fsFromFstab: %s %s -- %s,%s" % (dev, fs, kernel, initrd)
return (kernel, initrd, dev)
def parseEntry(entry):
"dissects bootentry and returns kernel, initrd, filesys"
fs = None
initrd = None
fsspl = entry.split(':')
if len(fsspl) > 1:
fs = fsspl[0]
entry = fsspl[1]
enspl = entry.split(',')
# Prepend '/' if missing
kernel = enspl[0]
if kernel[0] != '/':
kernel = '/' + kernel
if len(enspl) > 1:
initrd = enspl[1]
if initrd[0] != '/':
initrd = '/' + initrd
return kernel, initrd, fs
def copyFile(src, dst):
    "Wrapper for shutil.copyfile"
    import shutil
    if verbose:
        print "domUloader: cp %s %s" % (src, dst)
    stat = os.stat(src)
    if stat.st_size > 16*1024*1024:
        raise RuntimeError("Too large file %s (%s larger than 16MB)" \
                           % (src, stat.st_size))
    try:
        shutil.copyfile(src, dst)
    except:
        # Remove the partial copy, then re-raise the original exception
        os.unlink(dst)
        raise
def copyKernelAndInitrd(fs, kernel, initrd):
"""Finds fs in list of partitions, mounts the partition, copies
kernel [and initrd] off to dom0 files, umounts the partition again,
and returns sxpr pointing to these copies."""
import shutil
part = findPart(fs)
if not part:
raise RuntimeError("domUloader: Filesystem %s not exported\n" % fs)
part.mount()
try:
(fd, knm) = tempfile.mkstemp(prefix = "vmlinuz.", dir = tmpdir)
os.close(fd)
copyFile(part.mountpoint + kernel, knm)
except:
os.unlink(knm)
raise
if not quiet:
print "Copy kernel %s from %s to %s for booting" % \
(kernel, fs, knm)
sxpr = "linux (kernel %s)" % knm
if (initrd):
try:
(fd, inm) = tempfile.mkstemp(prefix = "initrd.", dir = tmpdir)
os.close(fd)
copyFile(part.mountpoint + initrd, inm)
except:
os.unlink(knm)
os.unlink(inm)
raise
sxpr += "(ramdisk %s)" % inm
part.umount()
return sxpr
def main(argv):
"Main routine: Parses options etc."
global quiet, verbose, tmpdir
    def usage():
        "Help output (usage info)"
        global verbose, quiet
        print >> sys.stderr, "domUloader usage: domUloader --disks=disklist [--root=rootFS]\n" \
              + " --entry=[dev:]kernel[,initrd] [--output=file] [--tmpdir=dir]\n" \
              + " [--quiet] [--verbose] [--help]\n"
        print >> sys.stderr, __doc__
#print "domUloader " + str(argv)
try:
(optlist, args) = getopt.gnu_getopt(argv, 'qvh', \
('disks=', 'root=', 'entry=', 'output=',
'tmpdir=', 'help', 'quiet', 'verbose'))
except:
usage()
sys.exit(1)
entry = None
output = None
root = None
disks = None
for (opt, oarg) in optlist:
if opt in ('-h', '--help'):
usage()
sys.exit(0)
elif opt in ('-q', '--quiet'):
quiet = True
elif opt in ('-v', '--verbose'):
verbose = True
elif opt == '--root':
root = oarg
elif opt == '--output':
output = oarg
elif opt == '--disks':
disks = oarg
elif opt == '--entry':
entry = oarg
elif opt == '--tmpdir':
tmpdir = oarg
if not entry or not disks:
usage()
sys.exit(1)
if output is None or output == "-":
fd = sys.stdout.fileno()
else:
fd = os.open(output, os.O_WRONLY)
if not os.access(tmpdir, os.X_OK):
os.mkdir(tmpdir)
os.chmod(tmpdir, 0750)
# We assume kernel and initrd are on the same FS,
# so only one fs
kernel, initrd, fs = parseEntry(entry)
setupDisks(disks)
if not fs:
if not root:
usage()
raise RuntimeError("domUloader: No root= to parse fstab and no disk in bootentry")
sys.exit(1)
kernel, initrd, fs = fsFromFstab(kernel, initrd, root)
sxpr = copyKernelAndInitrd(fs, kernel, initrd)
sys.stdout.flush()
os.write(fd, sxpr)
# Call main if called (and not imported)
if __name__ == "__main__":
main(sys.argv)
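
The bootentry format from the script's docstring, [dev:]kernel[,initrd], can also be illustrated in isolation. The following Python 3 sketch mirrors what parseEntry does above; the function name is hypothetical, and the return order (dev first) is a choice made only for this example.

```python
def split_bootentry(entry):
    """Split '[dev:]kernel[,initrd]' into (dev, kernel, initrd).

    dev and initrd are None when omitted; a leading '/' is added to
    the filenames if it is missing.
    """
    dev = None
    if ":" in entry:
        dev, entry = entry.split(":", 1)
    kernel, _, initrd = entry.partition(",")

    def absolute(path):
        return path if path.startswith("/") else "/" + path

    return dev, absolute(kernel), (absolute(initrd) if initrd else None)
```

When dev comes back as None, the caller is expected to fall back to the --root= filesystem and its /etc/fstab, as described in the docstring.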
[-- Attachment #1.2: Type: application/pgp-signature, Size: 189 bytes --]
[-- Attachment #2: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
* Re: [PATCH 0/3] domUloader
2006-01-19 17:19 ` Jeremy Katz
@ 2006-01-20 20:36 ` Stephen Tweedie
2006-01-20 23:08 ` Philip R. Auld
0 siblings, 1 reply; 30+ messages in thread
From: Stephen Tweedie @ 2006-01-20 20:36 UTC (permalink / raw)
To: Jeremy Katz; +Cc: Xen development list, Kurt Garloff
Hi,
On Thu, Jan 19, 2006 at 12:19:53PM -0500, Jeremy Katz wrote:
> Unfortunately, I am completely convinced that the right thing is to have
> the kernel for domU inside the domU's filesystem because anything else
> is just fundamentally not manageable.
I tend to agree. The real trouble starts when the storage needed to
boot that domU isn't even visible to the dom0, though --- perhaps
because we've got a virtual HBA (say an iSCSI initiator, or virtual FC
HBA), connected to a SAN which is filtering by initiator so that only
the domU can see the LUN's contents.
Bootstrapping that sort of environment is nasty.
> So, perhaps we do have to just
> suck it up and go the path of what's essentially mini-OS as a domU
> "bios"
If the domU pre-boot is going to have to have enough smarts to run a
full iSCSI initiator then we might be better off just biting the
bullet and running a proper kernel there with either kexec, or some
manner of domU respawn, to boot the correct kernel/initrd once the
pre-boot one has downloaded them. Either that, or we basically need
to have special cases for /boot to make sure that those files, plus
the grub.conf-type kernel args, are registered elsewhere (directory?)
for the dom0 to get them from.
--Stephen
* Re: [PATCH 0/3] domUloader
2006-01-20 20:36 ` Stephen Tweedie
@ 2006-01-20 23:08 ` Philip R. Auld
2006-01-23 14:19 ` Kurt Garloff
0 siblings, 1 reply; 30+ messages in thread
From: Philip R. Auld @ 2006-01-20 23:08 UTC (permalink / raw)
To: Stephen Tweedie; +Cc: Jeremy Katz, Xen development list, Kurt Garloff
Hi,
Rumor has it that on Fri, Jan 20, 2006 at 03:36:33PM -0500 Stephen Tweedie said:
> Hi,
>
> On Thu, Jan 19, 2006 at 12:19:53PM -0500, Jeremy Katz wrote:
>
> > Unfortunately, I am completely convinced that the right thing is to have
> > the kernel for domU inside the domU's filesystem because anything else
> > is just fundamentally not manageable.
>
> I tend to agree. The real trouble starts when the storage needed to
> boot that domU isn't even visible to the dom0, though --- perhaps
> because we've got a virtual HBA (say an iSCSI initiator, or virtual FC
> HBA), connected to a SAN which is filtering by initiator so that only
> the domU can see the LUN's contents.
>
> Bootstrapping that sort of environment is nasty.
>
Indeed. I agree with the domU filesystem approach as well.
How does one boot off of software iSCSI on a physical machine now?
There's clearly no boot ROM. I'm guessing people use PXE. Then
it's up to the initrd or initramfs to have the iSCSI smarts.
In that case the domU pre-boot would need to support vbd, v-nics and
have a PXE client.
Someone would have to serve the PXE requests and handle the images
for that of course. But I think treating it more like a BIOS on
a physical machine is better than having something that boots in ways
that a physical machine doesn't. The more like real machines the VMs
look the better.
Of course, that would mean support for root on iSCSI in the installer
and mkinitrd code.
Anyway, my 2 cents...
Cheers,
Phil
> > So, perhaps we do have to just
> > suck it up and go the path of what's essentially mini-OS as a domU
> > "bios"
>
> If the domU pre-boot is going to have to have enough smarts to run a
> full iSCSI initiator then we might be better off just biting the
> bullet and running a proper kernel there with either kexec, or some
> manner of domU respawn, to boot the correct kernel/initrd once the
> pre-boot one has downloaded them. Either that, or we basically need
> to have special cases for /boot to make sure that those files, plus
> the grub.conf-type kernel args, are registered elsewhere (directory?)
> for the dom0 to get them from.
>
> --Stephen
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
--
Philip R. Auld, Ph.D. Egenera, Inc.
Software Architect 165 Forest St.
(508) 858-2628 Marlboro, MA 01752
* Re: [PATCH] Re: [PATCH 0/3] domUloader
2006-01-19 13:06 ` Tim Deegan
2006-01-20 12:43 ` Kurt Garloff
@ 2006-01-23 13:39 ` Tim Deegan
1 sibling, 0 replies; 30+ messages in thread
From: Tim Deegan @ 2006-01-23 13:39 UTC (permalink / raw)
To: xen-devel
[-- Attachment #1: Type: text/plain, Size: 831 bytes --]
On Thu, Jan 19, 2006 at 01:06:18PM +0000, Tim Deegan wrote:
> The partition handling is only enough to find the "active" partition, so
> it doesn't handle extended partitions. That's not a problem for pygrub,
> but would need to be done to have the extraction tool handle partitions
> properly.
A new version of my patch is attached that understands DOS partition
tables (so long as your libext2fs libraries are up to date.)
I do prefer Kurt's changes to the xm create "bootloader" syntax. They
are cleaner than what I've done with pyfscat. I'd like to do a merge of
the pygrub extraction code with his bootloader structure, but haven't
time to do it properly at the moment.
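
For reference, the "guest:(dev)/path/to/file" syntax splits up roughly as below. This is a standalone Python 3 sketch of the parsing step only, following the same two regular expressions as file_from_guest in the attached patch; the function name and the tuple it returns are assumptions made for the example.

```python
import re


def parse_guest_filename(name):
    """Split 'guest:(dev)/path' into (dev, partno, path).

    Returns None if `name` does not use the guest: syntax. dev is None
    when no '(dev)' part is given (meaning: use the first disk), and
    partno is 0 when the device name carries no partition number.
    """
    if not name.startswith("guest:"):
        return None
    rest = name[len("guest:"):]
    dev, partno, path = None, 0, rest
    m = re.match(r"^\(([^)]+)\)(.*)$", rest)
    if m:
        dev, path = m.group(1), m.group(2)
        # A trailing number on the device name is the partition number.
        m = re.match(r"^([^0-9]+)([0-9]+)$", dev)
        if m:
            dev, partno = m.group(1), int(m.group(2))
    return dev, partno, path
```

So kernel = "guest:(hda1)/boot/vmlinuz" would mean partition 1 of the guest's hda, and a partition number of 0 means the image is treated as an unpartitioned filesystem.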
Tim.
--
Tim Deegan (My opinions, not the University's)
Systems Research Group
University of Cambridge Computer Laboratory
[-- Attachment #2: guest.patch --]
[-- Type: text/x-patch, Size: 8835 bytes --]
# HG changeset patch
# User tjd21@labyrinth.cl.cam.ac.uk
# Node ID e35449fc66de226857fbf7946b2cccadbff23dd7
# Parent c4ae9456a4595f046f08aea2f2e7b3664b50ab82
Added "guest:(dev)/path/to/file" syntax to xm create filenames
Added pyfscat script to pygrub, needed for guest: syntax
Signed-off-by: Tim Deegan <tjd21@cam.ac.uk>
diff -r c4ae9456a459 -r e35449fc66de tools/pygrub/setup.py
--- a/tools/pygrub/setup.py Fri Jan 20 19:31:09 2006
+++ b/tools/pygrub/setup.py Mon Jan 23 13:31:25 2006
@@ -43,7 +43,7 @@
author_email='katzj@redhat.com',
license='GPL',
package_dir={'grub': 'src'},
- scripts = ["src/pygrub"],
+ scripts = ["src/pygrub", "src/pyfscat"],
packages=pkgs,
ext_modules = fsys_mods
)
diff -r c4ae9456a459 -r e35449fc66de tools/python/xen/xm/create.py
--- a/tools/python/xen/xm/create.py Fri Jan 20 19:31:09 2006
+++ b/tools/python/xen/xm/create.py Mon Jan 23 13:31:25 2006
@@ -26,6 +26,8 @@
import socket
import commands
import time
+import re
+import tempfile
import xen.lowlevel.xc
@@ -396,6 +398,9 @@
fn=set_value, default=None,
use="X11 Authority to use")
+gopts.var('delete_after', val='FILES',
+ fn=append_value, default=[],
+ use="Files to delete after domain starts")
def err(msg):
"""Print an error to stderr and exit.
@@ -418,13 +423,58 @@
else:
return s
+def file_from_guest(file, vals):
+ """Extract a file from one of the guest disk images.
+ """
+ if len(vals.disk) < 1:
+ err("No disks configured and guest-fs file requested")
+ m = re.match('^\(([^)]+)\)(.*)$', file)
+ partno = 0
+ if m != None:
+ dev = m.group(1)
+ file = m.group(2)
+ m = re.match('^([^0-9]+)([0-9]+)$', dev)
+ if m != None:
+ dev = m.group(1)
+ partno = int(m.group(2))
+ dev_uname = None
+ for disk in vals.disk:
+ (uname, vdev, _, _) = disk
+ if dev == vdev:
+ dev_uname = uname
+ break
+ if dev_uname == None:
+ err("Can't find guest block dev %s to extract %s" % (dev, file))
+ else:
+ (dev_uname, _, _, _) = vals.disk[0]
+ dev_file = blkif.blkdev_uname_to_file(dev_uname)
+ if dev_file == None:
+ err("Can't find guest disk image %s to extract %s" % (dev_uname, file))
+ (tfd, fname) = tempfile.mkstemp()
+ os.close(tfd)
+ rv = os.system("pyfscat -q -p %i %s %s %s"
+ % (partno, dev_file, file, fname))
+ if rv != 0:
+ os.unlink(fname)
+ err("Can't extract %s from guest disk image %s" % (file, dev_uname))
+ vals.delete_after.append(fname)
+ return fname
+
+def parse_filename(file, vals):
+ """Parse filename for 'guest:' format
+ """
+ if file.startswith("guest:"):
+ return file_from_guest(file[6:], vals)
+ else:
+ return os.path.abspath(file)
+
def configure_image(vals):
"""Create the image config.
"""
config_image = [ vals.builder ]
- config_image.append([ 'kernel', os.path.abspath(vals.kernel) ])
+ config_image.append([ 'kernel', parse_filename(vals.kernel, vals) ])
if vals.ramdisk:
- config_image.append([ 'ramdisk', os.path.abspath(vals.ramdisk) ])
+ config_image.append([ 'ramdisk', parse_filename(vals.ramdisk, vals) ])
if vals.cmdline_ip:
cmdline_ip = strip('ip=', vals.cmdline_ip)
config_image.append(['ip', cmdline_ip])
@@ -795,6 +845,10 @@
server.xend_domain_destroy(dom)
err("Failed to unpause domain %s" % dom)
opts.info("Started domain %s" % (dom))
+
+ for file in opts.vals.delete_after:
+ os.unlink(file)
+
return int(sxp.child_value(dominfo, 'domid'))
def parseCommandLine(argv):
diff -r c4ae9456a459 -r e35449fc66de tools/pygrub/src/pyfscat
--- /dev/null Fri Jan 20 19:31:09 2006
+++ b/tools/pygrub/src/pyfscat Mon Jan 23 13:31:25 2006
@@ -0,0 +1,169 @@
+#!/usr/bin/python
+#
+# pyfscat - extract a file's contents from a filesystem image.
+#
+# Copyright (C) 2005 XenSource Ltd
+#
+# Based on pygrub, Copyright 2005 Red Hat, Inc.
+# Jeremy Katz <katzj@redhat.com>
+#
+# This software may be freely redistributed under the terms of the GNU
+# general public license.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+#
+
+import os, sys, struct, getopt
+
+sys.path = [ '/usr/lib/python' ] + sys.path
+
+import grub.fsys
+
+SECTOR_SIZE=512
+
+def get_part_offset(dev_path, quiet, partno):
+ """Extract a partition offset from a DOS-style partition table.
+ """
+ if partno < 0:
+ return -1
+ if partno == 0:
+ return 0
+ try:
+ fd = os.open(dev_path, os.O_RDONLY)
+ buf = os.read(fd, 512)
+ if (len(buf) < 512
+ or struct.unpack("<H", buf[0x1fe: 0x200]) != (0xaa55,)):
+ if not quiet:
+ print >> sys.stderr, ("No partition table in %s"
+ % (dev_path))
+ return -1
+ if partno < 5:
+ poff = 430 + 16 * partno
+ offset = (struct.unpack("<L", buf[poff+8:poff+12])[0]
+ * SECTOR_SIZE)
+ return offset
+
+ # Find the extended partition table
+ xoff = 0
+ for poff in (446, 462, 478, 494): # partition offsets
+ sysind = struct.unpack("<c", buf[poff+4:poff+5])[0]
+ if sysind in ('\x05', '\x85', '\x0f'): # types of extended partn
+ xoff = struct.unpack("<L", buf[poff+8:poff+12])[0]*SECTOR_SIZE
+ break
+ if xoff == 0:
+ return -1
+
+ # Walk the extended partitions
+ toff = 0
+ p = 5
+ while True:
+ os.lseek(fd, xoff + toff, 0)
+ buf = os.read(fd, 512)
+ if (len(buf) < 512
+ or struct.unpack("<H", buf[0x1fe: 0x200]) != (0xaa55,)):
+ return -1
+ if (p == partno):
+ return ((struct.unpack("<L", buf[446+8:446+12])[0]
+ * SECTOR_SIZE) + xoff + toff)
+ p += 1
+ # Find an extended partition type, which is the next in the chain
+ toff = 0
+ for poff in (446, 462, 478, 494): # partition offsets
+ sysind = struct.unpack("<c", buf[poff+4:poff+5])[0]
+ if sysind in ('\x05', '\x85', '\x0f'): # extended?
+ toff = struct.unpack("<L",
+ buf[poff+8:poff+12])[0] * SECTOR_SIZE
+ break
+ if toff == 0:
+ return -1
+ except OSError:
+ return -1
+
+
+def file_from_fs_image(dev_path, path, out_path, quiet, partno):
+ """Extract a file from a filesystem image using pygrub's libraries.
+ """
+
+ # Read partition tables if we need to
+ offset = get_part_offset(dev_path, quiet, partno)
+ if offset < 0:
+ if not quiet:
+ print >> sys.stderr, ("Can't find partition %i in %s"
+ % (partno, dev_path))
+ return False
+
+ # Now, call the pygrub libraries to read the file.
+ fs = None
+ for fstype in grub.fsys.fstypes.values():
+ if fstype.sniff_magic(dev_path, offset):
+ fs = fstype.open_fs(dev_path, offset)
+ break
+ if fs is None:
+ if not quiet:
+ print >> sys.stderr, ("Can't open a filesystem image in %s"
+ % (dev_path))
+ return False
+ if not fs.file_exist(path):
+ if not quiet:
+ print >> sys.stderr, ("Can't find file %s in %s"
+ % (path, dev_path))
+ return False
+ fd = fs.open_file(path, os.O_RDONLY)
+ contents = fd.read()
+
+ # We have something to write: fix up somewhere to put it
+ if out_path == "-":
+ sys.stdout.write(contents)
+ else:
+ try:
+ out_file = open(out_path, 'w')
+ out_file.write(contents)
+ out_file.close()
+ except IOError:
+ if not quiet:
+ print >> sys.stderr, "Can't write to %s" % (out_path)
+ return False
+ return True
+
+def usage():
+ print >> sys.stderr, ("Usage: %s [-q] [-p n] <image> <path> <dest-file>\n"
+ "\n"
+ "Extracts file <path> from fs image <image> "
+ "to <dest-file>\n"
+ " -q Don't print error messages\n"
+ " -p n Use partition number n (in a disk image)\n"
+ % (sys.argv[0],))
+
+
+try:
+ opts, args = getopt.getopt(sys.argv[1:], 'qhp:',
+ ["quiet", "help", "partno="])
+except getopt.GetoptError:
+ usage()
+ sys.exit(1)
+if len(args) != 3:
+ usage()
+ sys.exit(1)
+
+quiet = False
+partno = 0
+for o, a in opts:
+ if o in ("-q", "--quiet"):
+ quiet = True
+ elif o in ("-h", "--help"):
+ usage()
+ sys.exit()
+ elif o in ("-p", "--partno"):
+ try:
+ partno = int(a)
+ except ValueError:
+ usage()
+ sys.exit(1)
+ if partno < 0:
+ usage()
+ sys.exit(1)
+
+if not file_from_fs_image(args[0], args[1], args[2], quiet, partno):
+ sys.exit(1)
* Re: [PATCH 0/3] domUloader
2006-01-20 23:08 ` Philip R. Auld
@ 2006-01-23 14:19 ` Kurt Garloff
2006-01-23 14:59 ` Philip R. Auld
0 siblings, 1 reply; 30+ messages in thread
From: Kurt Garloff @ 2006-01-23 14:19 UTC (permalink / raw)
To: Philip R. Auld; +Cc: Xen development list, Jeremy Katz
[-- Attachment #1.1: Type: text/plain, Size: 468 bytes --]
Hi,
On Fri, Jan 20, 2006 at 06:08:26PM -0500, Philip R. Auld wrote:
> Of course, that would mean support for root on iSCSI in the installer
> and mkinitrd code.
I would really handle iSCSI in dom0 and export sdX/hdX to domU.
That solves the nasty OOM problem as well and makes your domains know
less about the underlying storage, which is good in my vision
of virtualization.
Best,
--
Kurt Garloff, Head Architect, Director SUSE Labs (act.), Novell Inc.
* Re: [PATCH 0/3] domUloader
2006-01-23 14:19 ` Kurt Garloff
@ 2006-01-23 14:59 ` Philip R. Auld
0 siblings, 0 replies; 30+ messages in thread
From: Philip R. Auld @ 2006-01-23 14:59 UTC (permalink / raw)
To: Kurt Garloff, Stephen Tweedie, Jeremy Katz, Xen development list
Rumor has it that on Mon, Jan 23, 2006 at 03:19:50PM +0100 Kurt Garloff said:
> Hi,
>
> On Fri, Jan 20, 2006 at 06:08:26PM -0500, Philip R. Auld wrote:
> > Of course, that would mean support for root on iSCSI in the installer
> > and mkinitrd code.
>
> I would really handle iSCSI in dom0 and export sdX/hdX to domU.
> Solves the nasty OOM problem as well and makes your domains know
> less about the underlaying storage. Which is good in my vision
> of virtualization.
That's how I would do it too, if I was using iSCSI. Actually that
is how I do it with FC, and in fact dom0 doesn't know about it either.
I was responding to Stephen's comments about the requirements for
a pre-boot image.
Cheers,
Phil
>
> Best,
> --
> Kurt Garloff, Head Architect, Director SUSE Labs (act.), Novell Inc.
--
Philip R. Auld, Ph.D. Egenera, Inc.
Software Architect 165 Forest St.
(508) 858-2628 Marlboro, MA 01752
* RE: [PATCH 0/3] domUloader
@ 2006-01-26 10:17 Edwards, Nigel (Nigel Edwards)
2006-01-26 13:06 ` Mark Williamson
2006-01-26 13:37 ` Philip R. Auld
0 siblings, 2 replies; 30+ messages in thread
From: Edwards, Nigel (Nigel Edwards) @ 2006-01-26 10:17 UTC (permalink / raw)
To: Kurt Garloff, Philip R. Auld; +Cc: Jeremy Katz, Xen development list
>
> I would really handle iSCSI in dom0 and export sdX/hdX to domU.
> Solves the nasty OOM problem as well and makes your domains know
> less about the underlaying storage. Which is good in my vision
> of virtualization.
>
There are good arguments for isolating storage management in Dom0.
However, I have been looking at migration of domains with iSCSI
and have found it a pain to fix up so that the iSCSI disk
appears on the same /dev/sdx point in both source and destination
Dom0s.
That is why I have started looking at direct iSCSI attachment in DomU
via initrd. Then storage is fixed up automatically via migration of
network connections. The disadvantage of this, as Kurt points out,
is that it exposes your iSCSI infrastructure into DomU.
Cheers,
Nigel.
> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com
> [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of
> Kurt Garloff
> Sent: 23 January 2006 14:20
> To: Philip R. Auld
> Cc: Xen development list; Jeremy Katz
> Subject: Re: [Xen-devel] [PATCH 0/3] domUloader
>
>
> Hi,
>
> On Fri, Jan 20, 2006 at 06:08:26PM -0500, Philip R. Auld wrote:
> > Of course, that would mean support for root on iSCSI in the
> installer
> > and mkinitrd code.
>
> I would really handle iSCSI in dom0 and export sdX/hdX to domU.
> Solves the nasty OOM problem as well and makes your domains know
> less about the underlaying storage. Which is good in my vision
> of virtualization.
>
> Best,
> --
> Kurt Garloff, Head Architect, Director SUSE Labs (act.), Novell Inc.
>
* Re: [PATCH 0/3] domUloader
2006-01-26 10:17 Edwards, Nigel (Nigel Edwards)
@ 2006-01-26 13:06 ` Mark Williamson
2006-01-26 13:37 ` Philip R. Auld
1 sibling, 0 replies; 30+ messages in thread
From: Mark Williamson @ 2006-01-26 13:06 UTC (permalink / raw)
To: xen-devel
Cc: Edwards, Nigel (Nigel Edwards), Jeremy Katz, Philip R. Auld,
Kurt Garloff
> There are good arguments for isolating storage management in Dom0.
> However, I have been looking at migration of domains with iSCSI
> and have found it a pain to fix up so that the iSCSI disk
> appears on the same /dev/sdx point in both source and destination
> Dom0s.
If you wrote a setup script for iSCSI devices (as we have for NBD, etc.) you
could do some sort of lookup to identify the correct device node when the
domain arrives at a new host - then you wouldn't need the device node to
be the same everywhere.
Don't know terribly much about iSCSI, but you'd just need a device-node
independent way of specifying the LUN as the block device in the config file.
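A minimal sketch of such a lookup, assuming the host's udev populates a
/dev/disk/by-id directory and that the config file carries the stable ID
instead of a device node. The name resolve_disk and its parameters are
hypothetical, not part of any existing Xen script:

```python
import os


def resolve_disk(disk_id, by_id_dir="/dev/disk/by-id"):
    """Return the real device node behind a stable by-id name.

    disk_id is the host-independent identifier from the domain config;
    by_id_dir is assumed to hold udev-created symlinks.
    """
    link = os.path.join(by_id_dir, disk_id)
    if not os.path.islink(link):
        raise LookupError("no such disk id on this host: %s" % disk_id)
    # Follow the symlink (e.g. ../../sda) to whatever node this host assigned.
    return os.path.realpath(link)
```

Each host then resolves the same ID to its own sdX node when the domain
arrives, so the nodes no longer have to match.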
Cheers,
Mark
> That is why I have started looking at direct iSCSI attachment in DomU
> via initrd. Then storage is fixed up automatically via migration of
> network connections. The disadvantage of this, as Kurt points out,
> is that it exposes your iSCSI infrastructure into DomU.
>
> Cheers,
> Nigel.
>
> > -----Original Message-----
> > From: xen-devel-bounces@lists.xensource.com
> > [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of
> > Kurt Garloff
> > Sent: 23 January 2006 14:20
> > To: Philip R. Auld
> > Cc: Xen development list; Jeremy Katz
> > Subject: Re: [Xen-devel] [PATCH 0/3] domUloader
> >
> >
> > Hi,
> >
> > On Fri, Jan 20, 2006 at 06:08:26PM -0500, Philip R. Auld wrote:
> > > Of course, that would mean support for root on iSCSI in the installer
> > > and mkinitrd code.
> >
> > I would really handle iSCSI in dom0 and export sdX/hdX to domU.
> > Solves the nasty OOM problem as well and makes your domains know
> > less about the underlying storage. Which is good in my vision
> > of virtualization.
> >
> > Best,
> > --
> > Kurt Garloff, Head Architect, Director SUSE Labs (act.), Novell Inc.
--
Dave: Just a question. What use is a unicyle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
* Re: [PATCH 0/3] domUloader
2006-01-26 10:17 Edwards, Nigel (Nigel Edwards)
2006-01-26 13:06 ` Mark Williamson
@ 2006-01-26 13:37 ` Philip R. Auld
2006-01-26 14:01 ` Ian Campbell
1 sibling, 1 reply; 30+ messages in thread
From: Philip R. Auld @ 2006-01-26 13:37 UTC (permalink / raw)
To: Edwards, Nigel (Nigel Edwards)
Cc: Jeremy Katz, Xen development list, Kurt Garloff
Rumor has it that on Thu, Jan 26, 2006 at 10:17:22AM -0000 Edwards, Nigel (Nigel Edwards) said:
> >
> > I would really handle iSCSI in dom0 and export sdX/hdX to domU.
> > Solves the nasty OOM problem as well and makes your domains know
> > less about the underlying storage. Which is good in my vision
> > of virtualization.
> >
> There are good arguments for isolating storage management in Dom0.
> However, I have been looking at migration of domains with iSCSI
> and have found it a pain to fix up so that the iSCSI disk
> appears on the same /dev/sdx point in both source and destination
> Dom0s.
It should be possible to use udev to assign a specific unique name
to each disk based on its UID, or whatever iSCSI has for WWN support.
I think that would be the right fix. The order-based /dev/sdX naming
has been a deficiency in Linux proper forever, but should be going
away at this point.
Cheers,
Phil
>
> That is why I have started looking at direct iSCSI attachment in DomU
> via initrd. Then storage is fixed up automatically via migration of
> network connections. The disadvantage of this, as Kurt points out,
> is that it exposes your iSCSI infrastructure into DomU.
>
> Cheers,
> Nigel.
>
> > -----Original Message-----
> > From: xen-devel-bounces@lists.xensource.com
> > [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of
> > Kurt Garloff
> > Sent: 23 January 2006 14:20
> > To: Philip R. Auld
> > Cc: Xen development list; Jeremy Katz
> > Subject: Re: [Xen-devel] [PATCH 0/3] domUloader
> >
> >
> > Hi,
> >
> > On Fri, Jan 20, 2006 at 06:08:26PM -0500, Philip R. Auld wrote:
> > > Of course, that would mean support for root on iSCSI in the installer
> > > and mkinitrd code.
> >
> > I would really handle iSCSI in dom0 and export sdX/hdX to domU.
> > Solves the nasty OOM problem as well and makes your domains know
> > less about the underlying storage. Which is good in my vision
> > of virtualization.
> >
> > Best,
> > --
> > Kurt Garloff, Head Architect, Director SUSE Labs (act.), Novell Inc.
> >
--
Philip R. Auld, Ph.D. Egenera, Inc.
Software Architect 165 Forest St.
(508) 858-2628 Marlboro, MA 01752
* Re: [PATCH 0/3] domUloader
2006-01-26 13:37 ` Philip R. Auld
@ 2006-01-26 14:01 ` Ian Campbell
2006-01-26 14:20 ` Philip R. Auld
0 siblings, 1 reply; 30+ messages in thread
From: Ian Campbell @ 2006-01-26 14:01 UTC (permalink / raw)
To: Philip R. Auld
Cc: Edwards, Nigel (Nigel Edwards), Jeremy Katz, Xen development list,
Kurt Garloff
On Thu, 2006-01-26 at 08:37 -0500, Philip R. Auld wrote:
> It should be possible to use udev to assign a specific unique name
> to each disk based
It already does this to some degree. From a handy Debian box:
$ tree /dev/disk/
/dev/disk/
|-- by-id
| |-- scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator -> ../../sda
| |-- scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator-part1 -> ../../sda1
| |-- scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator-part2 -> ../../sda2
| `-- scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator-part3 -> ../../sda3
|-- by-label
| `-- boot -> ../../sda1
|-- by-path
| |-- pci-0000:00:1f.1-ide-0:0 -> ../../hda
| |-- pci-0000:00:1f.2-scsi-0:0:0:0 -> ../../sda
| |-- pci-0000:00:1f.2-scsi-0:0:0:0-part1 -> ../../sda1
| |-- pci-0000:00:1f.2-scsi-0:0:0:0-part2 -> ../../sda2
| `-- pci-0000:00:1f.2-scsi-0:0:0:0-part3 -> ../../sda3
`-- by-uuid
`-- 8312472d-e311-4e0d-837c-6c4eb646a5e3 -> ../../sda1
The logic is in /etc/udev/persistent.rules. It might need updating with
some iSCSI knowledge I suppose.
Ian.
* Re: [PATCH 0/3] domUloader
2006-01-26 14:01 ` Ian Campbell
@ 2006-01-26 14:20 ` Philip R. Auld
2006-01-26 18:28 ` Kurt Garloff
0 siblings, 1 reply; 30+ messages in thread
From: Philip R. Auld @ 2006-01-26 14:20 UTC (permalink / raw)
To: Ian Campbell
Cc: Edwards, Nigel (Nigel Edwards), Jeremy Katz, Xen development list,
Kurt Garloff
Rumor has it that on Thu, Jan 26, 2006 at 02:01:38PM +0000 Ian Campbell said:
> On Thu, 2006-01-26 at 08:37 -0500, Philip R. Auld wrote:
> > It should be possible to use udev to assign a specific unique name
> > to each disk based
>
> It already does this to some degree. From a handy Debian box:
>
> $ tree /dev/disk/
> /dev/disk/
> |-- by-id
> | |-- scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator -> ../../sda
> | |-- scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator-part1 -> ../../sda1
> | |-- scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator-part2 -> ../../sda2
> | `-- scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator-part3 -> ../../sda3
> |-- by-label
> | `-- boot -> ../../sda1
> |-- by-path
> | |-- pci-0000:00:1f.1-ide-0:0 -> ../../hda
> | |-- pci-0000:00:1f.2-scsi-0:0:0:0 -> ../../sda
> | |-- pci-0000:00:1f.2-scsi-0:0:0:0-part1 -> ../../sda1
> | |-- pci-0000:00:1f.2-scsi-0:0:0:0-part2 -> ../../sda2
> | `-- pci-0000:00:1f.2-scsi-0:0:0:0-part3 -> ../../sda3
> `-- by-uuid
> `-- 8312472d-e311-4e0d-837c-6c4eb646a5e3 -> ../../sda1
>
> The logic is in /etc/udev/persistent.rules. It might need updating with
> some iSCSI knowledge I suppose.
Right, but this is just showing which UID it mapped to sda. My point
was you can configure it to give a specific name to a specific UID:
scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator -> ../../my_domU_disk
Then you use my_domU_disk in the xen domain configuration.
You can do this manually as well by looking up which sdX name the
device got and making a link to it, or a device node with the same
major/minor. But I think udev can be configured to do it for you.
I don't know the details of how to make it do that, but that's part
of what it's for.
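The manual variant could be sketched roughly like this. It assumes the
by-id links already exist on the host; make_alias and its arguments are
made-up names for illustration, and this is not how udev itself does it:

```python
import os


def make_alias(disk_id, alias, by_id_dir="/dev/disk/by-id", dev_dir="/dev"):
    """Create dev_dir/alias as a symlink to the node behind disk_id."""
    # Look up which node (sdX) the disk got on this host.
    target = os.path.realpath(os.path.join(by_id_dir, disk_id))
    alias_path = os.path.join(dev_dir, alias)
    tmp_path = alias_path + ".tmp"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(target, tmp_path)
    # Rename over any old alias so the name never points at nothing.
    os.replace(tmp_path, alias_path)
    return alias_path
```

Calling make_alias("scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator",
"my_domU_disk") as root would give the name used in the example above.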
Cheers,
Phil
>
> Ian.
>
--
Philip R. Auld, Ph.D. Egenera, Inc.
Software Architect 165 Forest St.
(508) 858-2628 Marlboro, MA 01752
* Re: [PATCH 0/3] domUloader
2006-01-26 14:20 ` Philip R. Auld
@ 2006-01-26 18:28 ` Kurt Garloff
2006-01-26 18:31 ` Mark Williamson
2006-01-26 18:57 ` Philip R. Auld
0 siblings, 2 replies; 30+ messages in thread
From: Kurt Garloff @ 2006-01-26 18:28 UTC (permalink / raw)
To: Philip R. Auld
Cc: Edwards, Nigel (Nigel Edwards), Jeremy Katz, Xen development list,
Ian Campbell
[-- Attachment #1.1: Type: text/plain, Size: 1116 bytes --]
Hi Philip,
On Thu, Jan 26, 2006 at 09:20:40AM -0500, Philip R. Auld wrote:
> Right, but this is just showing which UID it mapped to sda. My point
> was you can configure it to give a specific name to a specific UID:
>
> scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator -> ../../my_domU_disk
>
> Then you use my_domU_disk in the xen domain configuration.
> You can do this manually as well by looking up which sdX name the
> device got and making a link to it, or a device node with the same
> major/minor. But I think udev can be configured to do it for you.
> I don't know the details of how to make it do that, but that's part
> of what it's for.
Use
disk = [ 'iscsi:scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator,hda,w' ]
and drop a script into /etc/xen/scripts/block-iscsi
that handles it. (I'm assuming now that scsi-0ATA_ST... is a unique
identifier here; otherwise choose a different property of your iSCSI
target.)
Unless I misunderstood something, Mark's suggestion really
solves your problem.
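To illustrate how such a disk entry maps to a script name, here is a
small sketch. It assumes the 'type:params,vdev,mode' layout used above
and the block-<type> naming in /etc/xen/scripts; parse_disk and
hotplug_script are hypothetical helpers, not xend's actual code:

```python
from collections import namedtuple

DiskSpec = namedtuple("DiskSpec", "backend params vdev mode")


def parse_disk(spec):
    """Split 'type:params,vdev,mode' into its four parts."""
    uname, vdev, mode = spec.split(",")
    backend, sep, params = uname.partition(":")
    if not sep or not params:
        raise ValueError("expected 'type:params' in %r" % uname)
    return DiskSpec(backend, params, vdev, mode)


def hotplug_script(spec, script_dir="/etc/xen/scripts"):
    """Guess the script path for a disk entry (assumed naming scheme)."""
    return "%s/block-%s" % (script_dir, parse_disk(spec).backend)
```

The params field is what the script would receive, so it is free to
interpret it as a by-id name or any other unique property of the target.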
Best,
--
Kurt Garloff, Head Architect, Director SUSE Labs (act.), Novell Inc.
* Re: [PATCH 0/3] domUloader
2006-01-26 18:28 ` Kurt Garloff
@ 2006-01-26 18:31 ` Mark Williamson
2006-01-26 18:57 ` Philip R. Auld
1 sibling, 0 replies; 30+ messages in thread
From: Mark Williamson @ 2006-01-26 18:31 UTC (permalink / raw)
To: xen-devel
Cc: Edwards, Nigel (Nigel Edwards), Jeremy Katz, Philip R. Auld,
Kurt Garloff, Ian Campbell
Oh, and if you need help with the script people here can probably advise.
It's relatively simple - see the NBD scripting.
If it has more general applicability, it'd be nice to see a copy on xen-devel
if you get the chance ;-)
Cheers,
Mark
On Thursday 26 January 2006 18:28, Kurt Garloff wrote:
> Hi Philip,
>
> On Thu, Jan 26, 2006 at 09:20:40AM -0500, Philip R. Auld wrote:
> > Right, but this is just showing which UID it mapped to sda. My point
> > was you can configure it to give a specific name to a specific UID:
> >
> > scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator -> ../../my_domU_disk
> >
> > Then you use my_domU_disk in the xen domain configuration.
> > You can do this manually as well by looking up which sdX name the
> > device got and making a link to it, or a device node with the same
> > major/minor. But I think udev can be configured to do it for you.
> > I don't know the details of how to make it do that, but that's part
> > of what it's for.
>
> Use
> disk = [ 'iscsi:scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator,hda,w' ]
> and drop a script into /etc/xen/scripts/block-iscsi
> that handles it. (I'm assuming now that scsi-0ATA_ST... is a unique
> identifier here; otherwise choose a different property of your iSCSI
> target.)
>
> Unless I misunderstood something, Mark's suggestion really
> solves your problem.
>
> Best,
--
Dave: Just a question. What use is a unicyle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
* Re: [PATCH 0/3] domUloader
2006-01-26 18:28 ` Kurt Garloff
2006-01-26 18:31 ` Mark Williamson
@ 2006-01-26 18:57 ` Philip R. Auld
2006-01-26 19:14 ` Kurt Garloff
1 sibling, 1 reply; 30+ messages in thread
From: Philip R. Auld @ 2006-01-26 18:57 UTC (permalink / raw)
To: Kurt Garloff, Ian Campbell, Edwards, Nigel (Nigel Edwards),
Jeremy Katz, Xen development list
Hi Kurt,
Rumor has it that on Thu, Jan 26, 2006 at 07:28:53PM +0100 Kurt Garloff said:
> Hi Philip,
>
> On Thu, Jan 26, 2006 at 09:20:40AM -0500, Philip R. Auld wrote:
> > Right, but this is just showing which UID it mapped to sda. My point
> > was you can configure it to give a specific name to a specific UID:
> >
> > scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator -> ../../my_domU_disk
> >
> > Then you use my_domU_disk in the xen domain configuration.
> > You can do this manually as well by looking up which sdX name the
> > device got and making a link to it, or a device node with the same
> > major/minor. But I think udev can be configured to do it for you.
> > I don't know the details of how to make it do that, but that's part
> > of what it's for.
>
> Use
> disk = [ 'iscsi:scsi-0ATA_ST3200826AS_Linux_ATA-SCSI_simulator,hda,w' ]
> and drop a script into /etc/xen/scripts/block-iscsi
> that handles it. (I'm assuming now that scsi-0ATA_ST... is a unique
> identifier here; otherwise choose a different property of your iSCSI
> target.)
>
> Unless I misunderstood something, Mark's suggestion really
> solves your problem.
>
Thanks. But I don't have a problem (at least not this one anyway ;).
We're both talking about different ways to do the same thing.
My point again is that udev can handle the task of creating
well-known, consistent device names for uniquely identifiable
disks. And that it makes sense to me to just configure udev
properly and you do not need to do anything xen specific to
solve this problem. And it is not iSCSI specific either.
As I said, I don't know for sure how to make udev do
this for iSCSI. For FC you can use the scsi_id callout to
get the UID and then have a udev rule use it to generate a
device name. There may not exist an analog of scsi_id for
iSCSI. It's the same as what you are suggesting except it's
Linux-wide and not xen only :)
Cheers,
Phil
> Best,
> --
> Kurt Garloff, Head Architect, Director SUSE Labs (act.), Novell Inc.
--
Philip R. Auld, Ph.D. Egenera, Inc.
Software Architect 165 Forest St.
(508) 858-2628 Marlboro, MA 01752
* Re: [PATCH 0/3] domUloader
2006-01-26 18:57 ` Philip R. Auld
@ 2006-01-26 19:14 ` Kurt Garloff
0 siblings, 0 replies; 30+ messages in thread
From: Kurt Garloff @ 2006-01-26 19:14 UTC (permalink / raw)
To: Philip R. Auld
Cc: Edwards, Nigel (Nigel Edwards), Jeremy Katz, Xen development list,
Ian Campbell
[-- Attachment #1.1: Type: text/plain, Size: 1763 bytes --]
Hi Philip,
On Thu, Jan 26, 2006 at 01:57:35PM -0500, Philip R. Auld wrote:
> Thanks. But I don't have a problem (at least not this one anyway ;).
>
> We're both talking about different ways to do the same thing.
> My point again is that udev can handle the task of creating
> well-known, consistent device names for uniquely identifiable
> disks. And that it makes sense to me to just configure udev
> properly and you do not need to do anything xen specific to
> solve this problem. And it is not iSCSI specific either.
... and you just use phy:by-id/name as for disk exporting then.
Sure, that would work.
The block-iscsi script has the advantage that it's possible
to connect to the iSCSI target only when needed and disconnect
once you are done. This helps you when migrating VMs from
one physical machine to another. I don't know the exact
semantics of multiple parallel connections to an iSCSI target.
They should probably not cause problems, in which case you'd
get away with your phy: solution.
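The connect-on-demand idea can be sketched as below. How login and
logout are really done depends on the iSCSI initiator in use, so both
are passed in as callables here; attached, login and logout are
illustrative names only, not an existing interface:

```python
from contextlib import contextmanager


@contextmanager
def attached(target, login, logout):
    """Hold a connection to an iSCSI target only while the block runs.

    login(target) is assumed to return the local device node;
    logout(target) is assumed to tear the session down again.
    """
    device = login(target)
    try:
        yield device
    finally:
        # Always disconnect, even if using the device failed, so the
        # target is free when the domain moves to another host.
        logout(target)
```

A script in the style of block-iscsi would split the two halves between
its add and remove actions; the context manager only shows the pairing.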
> As I said, I don't know for sure how to make udev do
> this for iSCSI. For FC you can use the scsi_id callout to
> get the UID and then have a udev rule use it to generate a
> device name. There may not exist an analog of scsi_id for
> iSCSI. It's the same as what you are suggesting except it's
> Linux-wide and not xen only :)
You're trying to solve the persistent device naming problem.
Mark and I are trying to connect to and disconnect from iSCSI
targets on demand (and have the possibility to solve the
persistent device name problem along the way for xen
-- if needed.)
Anyway, we're far off-topic from the original post now.
Best,
--
Kurt Garloff, Head Architect, Director SUSE Labs (act.), Novell Inc.
* Re: [PATCH 0/3] domUloader
2006-01-16 23:43 [PATCH 0/3] domUloader Kurt Garloff
2006-01-17 11:52 ` Anthony Liguori
2006-01-17 12:33 ` [PATCH] " Tim Deegan
@ 2006-03-22 18:59 ` Matt Ayres
2006-03-22 22:01 ` Kurt Garloff
2 siblings, 1 reply; 30+ messages in thread
From: Matt Ayres @ 2006-03-22 18:59 UTC (permalink / raw)
To: Kurt Garloff, Xen development list
Kurt Garloff wrote:
>
> I extended the infrastructure a bit and added another bootloader.
> Unlike pygrub it does not offer a menu and does not parse the
> grub menu.lst; it's meant for paravirtualized domains and thus
> we accept that the booted kernel is selected differently. By e.g.
> a symlink, if one wants to control it from the domU. The bootloader
> is called domUloader.
>
Is there a project page for this or a tarball / RPM out there that
includes domUloader and its latest updates?
* Re: [PATCH 0/3] domUloader
2006-03-22 18:59 ` Matt Ayres
@ 2006-03-22 22:01 ` Kurt Garloff
2006-04-17 19:56 ` Matt Ayres
0 siblings, 1 reply; 30+ messages in thread
From: Kurt Garloff @ 2006-03-22 22:01 UTC (permalink / raw)
To: Matt Ayres; +Cc: Xen development list
[-- Attachment #1.1: Type: text/plain, Size: 941 bytes --]
Hi Matt,
On Wed, Mar 22, 2006 at 01:59:23PM -0500, Matt Ayres wrote:
> Kurt Garloff wrote:
>
> >I extended the infrastructure a bit and added another bootloader.
> >Unlike pygrub it does not offer a menu and does not parse the
> >grub menu.lst; it's meant for paravirtualized domains and thus
> >we accept that the booted kernel is selected differently. By e.g.
> >a symlink, if one wants to control it from the domU. The bootloader
> >is called domUloader.
>
> Is there a project page for this or a tarball / RPM out there that
> includes domUloader and its latest updates?
I was hoping to see it merged right away, so I did not set up a project
page. Seems I have to do it :-(
The Novell/SUSE RPMs include the domUloader functionality.
You can find the latest version in our current betas or on
http://forge.novell.com/modules/xfmod/project/?xenpreview
Best,
--
Kurt Garloff, Head Architect Linux, Novell Inc.
* Re: [PATCH 0/3] domUloader
2006-03-22 22:01 ` Kurt Garloff
@ 2006-04-17 19:56 ` Matt Ayres
0 siblings, 0 replies; 30+ messages in thread
From: Matt Ayres @ 2006-04-17 19:56 UTC (permalink / raw)
To: Kurt Garloff, Xen development list
Kurt Garloff wrote:
> Hi Matt,
>
> On Wed, Mar 22, 2006 at 01:59:23PM -0500, Matt Ayres wrote:
>> Kurt Garloff wrote:
>>
>>> I extended the infrastructure a bit and added another bootloader.
>>> Unlike pygrub it does not offer a menu and does not parse the
>>> grub menu.lst; it's meant for paravirtualized domains and thus
>>> we accept that the booted kernel is selected differently. By e.g.
>>> a symlink, if one wants to control it from the domU. The bootloader
>>> is called domUloader.
>> Is there a project page for this or a tarball / RPM out there that
>> includes domUloader and its latest updates?
>
> I was hoping to see it merged right away, so I did not set up a project
> page. Seems I have to do it :-(
>
> The Novell/SUSE RPMs include the domUloader functionality.
> You can find the latest version in our current betas or on
> http://forge.novell.com/modules/xfmod/project/?xenpreview
>
Hi Kurt,
Are there any plans to re-diff domUloader against 3.0.2 and post to this
list? I've been having quite a difficult time getting your original
patches to apply.
Thanks,
Matt
end of thread, other threads:[~2006-04-17 19:56 UTC | newest]
Thread overview: 30+ messages
2006-01-16 23:43 [PATCH 0/3] domUloader Kurt Garloff
2006-01-17 11:52 ` Anthony Liguori
2006-01-17 14:34 ` Kurt Garloff
2006-01-17 17:28 ` Adam Heath
2006-01-17 21:28 ` Kurt Garloff
2006-01-17 21:41 ` Anthony Liguori
2006-01-18 18:06 ` Jeremy Katz
2006-01-18 23:21 ` Kurt Garloff
2006-01-19 4:31 ` Anthony Liguori
2006-01-19 17:19 ` Jeremy Katz
2006-01-20 20:36 ` Stephen Tweedie
2006-01-20 23:08 ` Philip R. Auld
2006-01-23 14:19 ` Kurt Garloff
2006-01-23 14:59 ` Philip R. Auld
2006-01-17 12:33 ` [PATCH] " Tim Deegan
[not found] ` <1137607621.22846.17.camel@bree.local.net>
2006-01-19 13:06 ` Tim Deegan
2006-01-20 12:43 ` Kurt Garloff
2006-01-23 13:39 ` Tim Deegan
2006-03-22 18:59 ` Matt Ayres
2006-03-22 22:01 ` Kurt Garloff
2006-04-17 19:56 ` Matt Ayres
-- strict thread matches above, loose matches on Subject: below --
2006-01-26 10:17 Edwards, Nigel (Nigel Edwards)
2006-01-26 13:06 ` Mark Williamson
2006-01-26 13:37 ` Philip R. Auld
2006-01-26 14:01 ` Ian Campbell
2006-01-26 14:20 ` Philip R. Auld
2006-01-26 18:28 ` Kurt Garloff
2006-01-26 18:31 ` Mark Williamson
2006-01-26 18:57 ` Philip R. Auld
2006-01-26 19:14 ` Kurt Garloff