* Re: booting from virtio-blk
From: Hollis Blanchard @ 2008-04-01 16:13 UTC
To: Anthony Liguori
Cc: kvm-ppc-devel, Christian Ehrhardt, Rusty Russell, kvm-devel

[-- Attachment #1: Type: text/plain, Size: 2579 bytes --]

On Tue, 2008-04-01 at 09:46 -0500, Anthony Liguori wrote:
> Hollis Blanchard wrote:
> > On Tue, 2008-04-01 at 14:08 +0200, Christian Ehrhardt wrote:
> >> bash-3.00# cat /proc/partitions
> >> major minor #blocks name
> >> [...]
> >> 254 0 22517998136852480 vda <- ?broken?
> >>
> >
> > My guess is this is run-of-the-mill endianness mismatch.
> > 22517998136852480 = 0x00500000_00000000, which 64-bit byteswapped would
> > be 0x5000, and that's probably a reasonable number of 512-byte blocks.
> > Is your disk image 10MB?
> >
> > Why would we have a problem, since both guest and host are big-endian?
> > Because virtio is a PCI device, and PCI MMIO are LE, so
> > __virtio_config_val() in the guest is (correctly) using le64_to_cpu().
> >
> > Why didn't we have problems with virtio-net? Because virtio-net doesn't
> > seem to have anything interesting in PCI config space. virtio-blk's
> > config space contains the capacity and a few other pieces of
> > information.
> >
> > The fix needs to be in qemu, and given the lack of qemu endianness
> > infrastructure, I'm afraid it will be a hack. See
> > http://svn.savannah.nongnu.org/viewvc/trunk/hw/e1000.c?root=qemu&r1=4046&r2=4045&pathrev=4046
> > for reference. We all know that TARGET_WORDS_BIGENDIAN is totally
> > wrong, but unfortunately it also seems to be the only (accidentally)
> > working solution in qemu without major IO system rework. :(
>
> It's actually not so bad since the virtio config space is already read
> one byte at a time. The following should help.
>
> diff --git a/qemu/hw/virtio-blk.c b/qemu/hw/virtio-blk.c
> index 0f55d2a..492bd7f 100644
> --- a/qemu/hw/virtio-blk.c
> +++ b/qemu/hw/virtio-blk.c
> @@ -134,8 +134,8 @@ static void virtio_blk_update_config(VirtIODevice *vdev, uin
>      int64_t capacity;
>
>      bdrv_get_geometry(s->bs, &capacity);
> -    blkcfg.capacity = capacity;
> -    blkcfg.seg_max = 128 - 2;
> +    blkcfg.capacity = cpu_to_le64(capacity);
> +    blkcfg.seg_max = cpu_to_le32(128 - 2);
>      memcpy(config, &blkcfg, sizeof(blkcfg));
>  }

Thanks Anthony, you've saved me a lot of debug time! Rusty, doing 64-bit
PCI config space accesses with ioread8() definitely violates the
principle of least surprises, and would have taken me a long time to
track down. :(

Attached is a boot log of a PowerPC guest booting from virtio-blk root.

"ramdisk_image" is the standard ~4MB image provided with DENX Embedded
Linux Development Kit. Booting is also *way* faster than NFS root (a few
seconds to get to a shell :) .

--
Hollis Blanchard
IBM Linux Technology Center

[-- Attachment #2: virtio-blk.log --]
[-- Type: text/x-log, Size: 5196 bytes --]

bash-3.00# ./qemu-system-ppcemb -M bamboo -nographic -kernel ../../uImage.bamboo -L ../pc-bios/ -append "root=/dev/vda rw debug" -net nic,model=virtio -net tap -drive file=/images/ramdisk_image,if=virtio,boot=on
bamboo_init: START
Ram size passed is: 144 MB
Calling function ppc440_init
setup mmio
setup universal controller
trying to setup sdram controller
sdram_unmap_bcr: Unmap RAM area 0000000000000000 00400000
sdram_unmap_bcr: Unmap RAM area 0000000000000000 00400000
sdram_set_bcr: Map RAM area 0000000000000000 08000000
sdram_set_bcr: Map RAM area 0000000000000000 01000000
Initializing first serial port
ppc405_serial_init: offset 0000000000000300
Done calling ppc440_init
bamboo_init: load kernel
kernel is at guest address: 0x0
bamboo_init: load device tree file
device tree address is at guest address: 0x2b2100
bamboo_init: loading kvm registers
bamboo_init: DONE
Using Bamboo machine description
Linux version 2.6.25-rc3-hg1858cec8eb87-dirty (hollisb@basalt) (gcc version 3.4.2) #152 Tue Apr 1 10:52:01 CDT 2008
Found legacy serial port 0 for /plb/opb/serial@ef600300
 mem=ef600300, taddr=ef600300, irq=0, clk=11059200, speed=115200
Found legacy serial port 1 for /plb/opb/serial@ef600400
 mem=ef600400, taddr=ef600400, irq=0, clk=11059200, speed=0
console [udbg0] enabled
Entering add_active_range(0, 0, 36864) 0 entries of 256 used
setup_arch: bootmem
arch: exit
Top of RAM: 0x9000000, Total RAM: 0x9000000
Memory hole size: 0MB
Zone PFN ranges:
 DMA 0 -> 36864
 Normal 36864 -> 36864
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
 0: 0 -> 36864
On node 0 totalpages: 36864
 DMA zone: 288 pages used for memmap
 DMA zone: 0 pages reserved
 DMA zone: 36576 pages, LIFO batch:7
 Normal zone: 0 pages used for memmap
 Movable zone: 0 pages used for memmap
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 36576
Kernel command line: root=/dev/vda rw debug
irq: Allocated host of type 2 @0xc03f3880
UIC0 (32 IRQ sources) at DCR 0xc0
irq: Default host set to @0xc03f3880
PID hash table entries: 1024 (order: 10, 4096 bytes)
time_init: decrementer frequency = 666.666660 MHz
time_init: processor frequency = 666.666660 MHz
clocksource: timebase mult[600000] shift[22] registered
clockevent: decrementer mult[aaaa] shift[16] cpu[0]
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 143060k/147456k available (2632k kernel code, 4252k reserved, 100k data, 125k bss, 132k init)
SLUB: Genslabs=10, HWalign=32, Order=0-1, MinObjects=4, CPUs=1, Nodes=1
Calibrating delay loop... 2490.36 BogoMIPS (lpj=4980736)
Mount-cache hash table entries: 512
net_namespace: 156 bytes
NET: Registered protocol family 16
PCI host bridge /plb/pci@ec000000 (primary) ranges:
 MEM 0x00000000a0000000..0x00000000bfffffff -> 0x00000000a0000000
 IO 0x00000000e8000000..0x00000000e800ffff -> 0x0000000000000000
4xx PCI DMA offset set to 0x00000000
PCI: Probing PCI hardware
PCI: Hiding 4xx host bridge resources 0000:00:00.0
irq: irq_create_mapping(0xc03f3880, 0x1c)
irq: -> using host @c03f3880
irq: -> obtained virq 28
irq: irq_create_mapping(0xc03f3880, 0x1b)
irq: -> using host @c03f3880
irq: -> obtained virq 27
Time: timebase clocksource has been installed.
NET: Registered protocol family 2
IP route cache hash table entries: 2048 (order: 1, 8192 bytes)
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
TCP reno registered
irq: irq_create_mapping(0xc03f3880, 0x0)
irq: -> using host @c03f3880
irq: -> obtained virq 16
irq: irq_create_mapping(0xc03f3880, 0x1)
irq: -> using host @c03f3880
irq: -> obtained virq 17
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250.0: ttyS0 at MMIO 0xef600300 (irq = 16) is a 16450
console handover: boot [udbg0] -> real [ttyS0]
irq: irq_create_mapping(0xc03f3880, 0x0)
irq: -> using host @c03f3880
irq: -> existing mapping on virq 16
ef600300.serial: ttyS0 at MMIO 0xef600300 (irq = 16) is a 16450
irq: irq_create_mapping(0xc03f3880, 0x1)
irq: -> using host @c03f3880
irq: -> existing mapping on virq 17
brd: module loaded
Intel(R) PRO/1000 Network Driver - version 7.3.20-k2
Copyright (c) 1999-2006 Intel Corporation.
pcnet32.c:v1.34 14.Aug.2007 tsbogend@alpha.franken.de
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
PCI: Enabling device 0000:00:01.0 (0000 -> 0001)
 vda: unknown partition table
PCI: Enabling device 0000:00:02.0 (0000 -> 0001)
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
VFS: Mounted root (ext2 filesystem).
Freeing unused kernel memory: 132k init
root:~> ### Application running ...
root:~> ls
bin etc home linuxrc sbin usr
dev ftp lib proc tmp var
root:~>

[-- Attachment #3: Type: text/plain, Size: 278 bytes --]

-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace

[-- Attachment #4: Type: text/plain, Size: 158 bytes --]

_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel
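
The arithmetic in the quoted diagnosis, and the effect of the proposed fix, can be checked with a small sketch. It is an illustration only and does not use any qemu code: `swap64`, `store_le64` and `load_le64` are hypothetical helpers. `store_le64` stands in for what cpu_to_le64() followed by memcpy() achieves on the host side, and `load_le64` for a guest that reads the field back as little-endian bytes.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of a 64-bit value. */
uint64_t swap64(uint64_t v)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xff);  /* move the lowest byte of v to the top of r */
        v >>= 8;
    }
    return r;
}

/* Hypothetical helper: store v as 8 little-endian bytes, whatever the
 * endianness of the machine running this code. */
void store_le64(uint8_t *buf, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        buf[i] = (uint8_t)(v >> (8 * i));
}

/* Hypothetical helper: rebuild a 64-bit value from 8 little-endian bytes. */
uint64_t load_le64(const uint8_t *buf)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)buf[i] << (8 * i);
    return v;
}
```

Under these assumptions, swap64(0x0050000000000000) gives 0x5000, and 0x5000 blocks of 512 bytes is 10 MiB, which matches the guess in the quoted text. A value written with store_le64() and read with load_le64() round-trips on both big-endian and little-endian machines, which is the property the patch restores.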
* Re: booting from virtio-blk
From: Anthony Liguori @ 2008-04-01 17:05 UTC
To: Hollis Blanchard
Cc: kvm-ppc-devel, Christian Ehrhardt, Rusty Russell, kvm-devel

Hollis Blanchard wrote:
> On Tue, 2008-04-01 at 09:46 -0500, Anthony Liguori wrote:
>
> Thanks Anthony, you've saved me a lot of debug time! Rusty, doing 64-bit
> PCI config space accesses with ioread8() definitely violates the
> principle of least surprises, and would have taken me a long time to
> track down. :(
>
> Attached is a boot log of a PowerPC guest booting from virtio-blk root.
>
> "ramdisk_image" is the standard ~4MB image provided with DENX Embedded
> Linux Development Kit. Booting is also *way* faster than NFS root (a few
> seconds to get to a shell :) .
>

That suggests you have vmexit latency issues. A 4MB disk is pretty much
entirely cacheable in memory, so you probably end up with only a handful
of requests to get the full disk into memory. Conversely, when using
NFS, every single filesystem operation results in multiple packets being
delivered/received. To complicate matters further, NFS means you won't
be doing any dentry caching, so every single filesystem access will
result in requests, as opposed to just the first access.

What sort of ping latency do you get with virtio-net?

Regards,

Anthony Liguori
* Re: booting from virtio-blk
From: Anthony Liguori @ 2008-04-01 17:09 UTC
To: Hollis Blanchard
Cc: kvm-ppc-devel, Christian Ehrhardt, Rusty Russell, kvm-devel

Hollis Blanchard wrote:
>
> Thanks Anthony, you've saved me a lot of debug time! Rusty, doing 64-bit
> PCI config space accesses with ioread8() definitely violates the
> principle of least surprises, and would have taken me a long time to
> track down. :(
>

It's the unfortunate side-effect of using PCI config space without
passing its semantics through to the virtio devices. Right now, you do
a config_get which is basically a memcpy. If we didn't do accesses with
ioread8(), you could potentially have a caller that did a config_get()
of size 4 that didn't intend on having endian conversion applied.

The other option would have been to provide config_get() and
config_get8/16/32/64(), the latter performing endian conversion.

Regards,

Anthony Liguori

> Attached is a boot log of a PowerPC guest booting from virtio-blk root.
>
> "ramdisk_image" is the standard ~4MB image provided with DENX Embedded
> Linux Development Kit. Booting is also *way* faster than NFS root (a few
> seconds to get to a shell :) .
>
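
The two accessor styles described in this message can be sketched roughly as follows. This is an illustration under stated assumptions, not the kernel's virtio code: `fake_config` is a hypothetical byte array standing in for the device's config space (which the real driver reads with ioread8()), and the `config_get()` and `config_get64()` shown here are hypothetical functions that only borrow the names used in the discussion.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device config space: a capacity of
 * 0x5000 stored as little-endian bytes. */
static const uint8_t fake_config[8] = { 0x00, 0x50, 0x00, 0x00,
                                        0x00, 0x00, 0x00, 0x00 };

/* Untyped accessor: copies bytes one at a time, like a memcpy.
 * No endian conversion is applied; the caller gets raw device bytes. */
void config_get(size_t offset, void *buf, size_t len)
{
    uint8_t *p = buf;
    for (size_t i = 0; i < len; i++)
        p[i] = fake_config[offset + i];
}

/* Typed accessor: fetches 8 raw bytes and assembles them as a
 * little-endian value, so the result is in CPU order on any host. */
uint64_t config_get64(size_t offset)
{
    uint8_t raw[8];
    uint64_t v = 0;

    config_get(offset, raw, sizeof(raw));
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)raw[i] << (8 * i);
    return v;
}
```

With only the untyped accessor, each caller has to apply the conversion itself, which is what `__virtio_config_val()` is said to do with le64_to_cpu() earlier in the thread. A typed accessor moves that conversion into one place, at the cost of a wider interface.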
* Re: [kvm-ppc-devel] booting from virtio-blk
From: Benjamin Herrenschmidt @ 2008-04-01 20:36 UTC
To: Anthony Liguori
Cc: kvm-ppc-devel, kvm-devel, Rusty Russell, Hollis Blanchard

On Tue, 2008-04-01 at 12:09 -0500, Anthony Liguori wrote:

> It's the unfortunate side-effect of using PCI config space without
> passing its semantics through to the virtio devices. Right now, you do
> a config_get which is basically a memcpy. If we didn't do accesses with
> ioread8(), you could potentially have a caller that did a config_get()
> of size 4 that didn't intend on having endian conversion applied.
>
> The other option would have been to provide config_get() and
> config_get8/16/32/64(), the latter performing endian conversion.

Config space should be 8/16/32. Is that ever bridged to real PCI config
space anyway? Or only virtio? And it should be endian swapped at the
low level, either by your HV calls or by the low level kernel. Always.
That's how PCI config space is supposed to work.

Ben.
* Re: [kvm-ppc-devel] booting from virtio-blk
From: Anthony Liguori @ 2008-04-01 21:03 UTC
To: benh
Cc: kvm-ppc-devel, kvm-devel, Rusty Russell, Hollis Blanchard

Benjamin Herrenschmidt wrote:
> On Tue, 2008-04-01 at 12:09 -0500, Anthony Liguori wrote:
>
>
>> It's the unfortunate side-effect of using PCI config space without
>> passing its semantics through to the virtio devices. Right now, you do
>> a config_get which is basically a memcpy. If we didn't do accesses with
>> ioread8(), you could potentially have a caller that did a config_get()
>> of size 4 that didn't intend on having endian conversion applied.
>>
>> The other option would have been to provide config_get() and
>> config_get8/16/32/64(), the latter performing endian conversion.
>>
>
> Config space should be 8/16/32. Is that ever bridged to real PCI config
> space anyway? Or only virtio? And it should be endian swapped at the
> low level, either by your HV calls or by the low level kernel. Always.
> That's how PCI config space is supposed to work.
>

I guess the point is that virtio config space is an abstraction, with
the implementation that is based on PCI converting all accesses to a
series of 8-bit accesses. The virtio config space happens to be little
endian just like the PCI config space.

Regards,

Anthony Liguori

> Ben.
>
>
* Re: [kvm-ppc-devel] booting from virtio-blk
From: Benjamin Herrenschmidt @ 2008-04-01 21:14 UTC
To: Anthony Liguori
Cc: kvm-ppc-devel, kvm-devel, Rusty Russell, Hollis Blanchard

> > Config space should be 8/16/32. Is that ever bridged to real PCI config
> > space anyway? Or only virtio? And it should be endian swapped at the
> > low level, either by your HV calls or by the low level kernel. Always.
> > That's how PCI config space is supposed to work.
> >
>
> I guess the point is that virtio config space is an abstraction, with
> the implementation that is based on PCI converting all accesses to a
> series of 8-bit accesses. The virtio config space happens to be little
> endian just like the PCI config space.

But PCI does -not- convert all accesses into a series of 8-bit
accesses :-)

Ben.
* Re: [kvm-ppc-devel] booting from virtio-blk
From: Hollis Blanchard @ 2008-04-01 21:18 UTC
To: Anthony Liguori
Cc: kvm-ppc-devel, benh, Rusty Russell, kvm-devel

On Tue, 2008-04-01 at 16:03 -0500, Anthony Liguori wrote:
> Benjamin Herrenschmidt wrote:
> > On Tue, 2008-04-01 at 12:09 -0500, Anthony Liguori wrote:
> >
> >
> >> It's the unfortunate side-effect of using PCI config space without
> >> passing its semantics through to the virtio devices. Right now, you do
> >> a config_get which is basically a memcpy. If we didn't do accesses with
> >> ioread8(), you could potentially have a caller that did a config_get()
> >> of size 4 that didn't intend on having endian conversion applied.
> >>
> >> The other option would have been to provide config_get() and
> >> config_get8/16/32/64(), the latter performing endian conversion.
> >>
> >
> > Config space should be 8/16/32. Is that ever bridged to real PCI config
> > space anyway? Or only virtio? And it should be endian swapped at the
> > low level, either by your HV calls or by the low level kernel. Always.
> > That's how PCI config space is supposed to work.

Virtio accesses will not be bridged to real PCI space.

> I guess the point is that virtio config space is an abstraction, with
> the implementation that is based on PCI converting all accesses to a
> series of 8-bit accesses. The virtio config space happens to be little
> endian just like the PCI config space.

The point is that a virtio device appears as a PCI device. Like all
other PCI devices, it has config space. Unlike all other PCI devices,
its config space is accessed with 1-byte reads.

--
Hollis Blanchard
IBM Linux Technology Center
* Re: [kvm-ppc-devel] booting from virtio-blk
From: Benjamin Herrenschmidt @ 2008-04-01 21:24 UTC
To: Hollis Blanchard
Cc: kvm-ppc-devel, kvm-devel, Rusty Russell

On Tue, 2008-04-01 at 16:18 -0500, Hollis Blanchard wrote:

> The point is that a virtio device appears as a PCI device. Like all
> other PCI devices, it has config space. Unlike all other PCI devices,
> its config space is accessed with 1-byte reads.

Which is weirdo ... if you guys make it look like PCI, then -really-
make it look like PCI :-)

Ben.
Thread overview: 8+ messages
[not found] <1207051275240-git-send-email-ehrhardt@linux.vnet.ibm.com>
[not found] ` <47F225D2.9020409@linux.vnet.ibm.com>
[not found] ` <1207060430.6214.12.camel@basalt>
[not found] ` <47F24ABA.3020404@us.ibm.com>
2008-04-01 16:13 ` booting from virtio-blk Hollis Blanchard
2008-04-01 17:05 ` Anthony Liguori
2008-04-01 17:09 ` Anthony Liguori
2008-04-01 20:36 ` [kvm-ppc-devel] " Benjamin Herrenschmidt
2008-04-01 21:03 ` Anthony Liguori
2008-04-01 21:14 ` Benjamin Herrenschmidt
2008-04-01 21:18 ` Hollis Blanchard
2008-04-01 21:24 ` Benjamin Herrenschmidt