From: Stefan Hajnoczi
Date: Tue, 12 Oct 2010 09:44:39 +0100
Subject: Re: [Qemu-devel] [RFC] Passing boot order from qemu to seabios
To: Gleb Natapov
Cc: Bernhard Kohl, seabios@seabios.org, qemu-devel@nongnu.org

On Mon, Oct 11, 2010 at 6:04 PM, Gleb Natapov wrote:
> On Mon, Oct 11, 2010 at 12:01:58PM -0500, Anthony Liguori wrote:
>> On 10/11/2010 10:52 AM, Stefan Hajnoczi wrote:
>> >2010/10/11 Gleb Natapov:
>> >>On Mon, Oct 11, 2010 at 01:48:09PM +0100, Stefan Hajnoczi wrote:
>> >>>On Mon, Oct 11, 2010 at 12:16 PM, Bernhard Kohl wrote:
>> >>>>On 11.10.2010 12:18, ext Gleb Natapov wrote:
>> >>>>>Currently if VM is started with multiple disks it is almost impossible to
>> >>>>>guess which one of them will be used as boot device especially if there
>> >>>>>is a mix of ATA/virtio/SCSI devices.
>> >>>>>Essentially BIOS decides the order
>> >>>>>and without looking into the code you can't tell what the order will
>> >>>>>be (and in qemu-kvm if boot=on is used it brings even more havoc). We
>> >>>>>should allow fine-grained control of boot order from qemu command line,
>> >>>>>or as a minimum control what device will be used for booting.
>> >>>>>
>> >>>>>To do that along with inventing syntax to specify boot order on qemu
>> >>>>>command line we need to communicate boot order to seabios via fw_cfg
>> >>>>>interface. For that we need to have a way to unambiguously specify a
>> >>>>>disk from qemu to seabios. PCI bus address is not enough since not all
>> >>>>>devices are PCI (do we care about them?) and since one PCI device may
>> >>>>>control more than one disk (ATA slave/master, SCSI LUNs). We can do what
>> >>>>>EDD specification does. Describe disk as:
>> >>>>>     bus type (isa/pci),
>> >>>>>     address on a bus (16 bit base address for isa, b/s/f for pci)
>> >>>>>     device type (ATA/SCSI/VIRTIO)
>> >>>>>     device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
>> >>>>>
>> >>>>>Will it cover all use cases? Any other ideas?
>> >>>>I think this also applies to network booting via gPXE. Usually our VMs
>> >>>>have 4 NICs, mixed virtio-net and PCI pass-through. 2 of the NICs shall
>> >>>>be used for booting, even if there are hard disks or floppy disks
>> >>>>connected. This scenario is currently almost impossible to configure.
>> >>>Here is a gPXE patch to support fw_cfg. You can pass gPXE script files from
>> >>>the host to gPXE inside the guest. This means you can boot specific
>> >>>NICs:
>> >>>http://patchwork.ozlabs.org/patch/43777/
>> >>>
>> >>>Just wanted to post the link because it is related to the gPXE side of
>> >>>this discussion.
>> >>>
>> >>Don't we load gPXE for each NIC and seabios passes PCI device to boot from
>> >>when it invokes one of them?
>> >SeaBIOS may do that but gPXE internally just probes all PCI devices.
>> >It does not take advantage of the PCI bus/addr/fn that was passed to
>> >the option ROM. A gPXE instance will try booting from each available
>> >NIC in sequence.
>>
>> It still registers a BEV entry though, no?

Yes.

>> Does it at least try to boot from the PCI bus/addr/fn of the
>> selected BEV entry?

Not directly. It probes all PCI devices and tries them in bus/addr/fn
order. If you have two identical NICs and only the second NIC has the
boot ROM, gPXE will still try to network boot from the first NIC first.

Changing this behavior requires stashing away the bus/addr/fn that was
passed to the option ROM and then using it later in gPXE's startup.
It's possible but not implemented today.

Stefan
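
To make the ordering issue in the reply concrete, here is a rough sketch in
Python (not gPXE code). It models the probing described above: NICs are
tried in ascending bus/addr/fn order, so a stashed address would be needed
to make the selected NIC go first. The names PciAddr and probe_order, the
example addresses, and the choice to fall back to the remaining NICs after
the selected one are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class PciAddr(NamedTuple):
    """PCI bus/device/function triple; tuple comparison yields bus/addr/fn order."""
    bus: int
    dev: int
    fn: int


def probe_order(nics: List[PciAddr],
                selected: Optional[PciAddr] = None) -> List[PciAddr]:
    """Return the order in which NICs would be tried for network boot.

    With selected=None this models the behaviour described in the reply:
    every NIC is tried in ascending bus/addr/fn order. With a stashed
    address (hypothetical, not implemented in gPXE according to the
    reply), that NIC is moved to the front and the rest keep their order.
    """
    ordered = sorted(nics)
    if selected is not None and selected in ordered:
        ordered.remove(selected)
        ordered.insert(0, selected)
    return ordered


def fmt(addr: PciAddr) -> str:
    """Format an address in the usual bus:dev.fn notation."""
    return f"{addr.bus:02x}:{addr.dev:02x}.{addr.fn}"


# Two identical NICs; assume only the second one (00:04.0) has the boot ROM.
nics = [PciAddr(0, 4, 0), PciAddr(0, 3, 0)]

current = probe_order(nics)                            # no stashed address
wanted = probe_order(nics, selected=PciAddr(0, 4, 0))  # stashed address used

print("current:", [fmt(a) for a in current])
print("wanted: ", [fmt(a) for a in wanted])
```

In the first call the NIC at 00:03.0 comes first even though the boot ROM
sits on 00:04.0, which is the case described in the reply. The second call
shows the effect of honouring a stashed bus/addr/fn.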