From: Hollis Blanchard <hollisb@us.ibm.com>
To: Anthony Liguori <aliguori@us.ibm.com>
Cc: kvm-ppc-devel@lists.sourceforge.net,
Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>,
Rusty Russell <rusty@ozlabs.au.ibm.com>,
kvm-devel <kvm-devel@lists.sourceforge.net>
Subject: Re: [kvm-ppc-devel] booting from virtio-blk
Date: Tue, 01 Apr 2008 16:13:52 +0000
Message-ID: <1207066432.6214.29.camel@basalt>
In-Reply-To: <47F24ABA.3020404@us.ibm.com>
[-- Attachment #1: Type: text/plain, Size: 2579 bytes --]
On Tue, 2008-04-01 at 09:46 -0500, Anthony Liguori wrote:
> Hollis Blanchard wrote:
> > On Tue, 2008-04-01 at 14:08 +0200, Christian Ehrhardt wrote:
> >> bash-3.00# cat /proc/partitions
> >> major minor #blocks name
> >> [...]
> >> 254 0 22517998136852480 vda <- ?broken?
> >>
> >
> > My guess is this is run-of-the-mill endianness mismatch.
> > 22517998136852480 = 0x00500000_00000000, which 64-bit byteswapped would
> > be 0x5000, and that's probably a reasonable number of 512-byte blocks.
> > Is your disk image 10MB?
> >
> > Why would we have a problem, since both guest and host are big-endian?
> > Because virtio is a PCI device, and PCI MMIO are LE, so
> > __virtio_config_val() in the guest is (correctly) using le64_to_cpu().
> >
> > Why didn't we have problems with virtio-net? Because virtio-net doesn't
> > seem to have anything interesting in PCI config space. virtio-blk's
> > config space contains the capacity and a few other pieces of
> > information.
> >
> > The fix needs to be in qemu, and given the lack of qemu endianness
> > infrastructure, I'm afraid it will be a hack. See
> > http://svn.savannah.nongnu.org/viewvc/trunk/hw/e1000.c?root=qemu&r1=4046&r2=4045&pathrev=4046 for reference. We all know that TARGET_WORDS_BIGENDIAN is totally wrong, but unfortunately it also seems to be the only (accidentally) working solution in qemu without major IO system rework. :(
>
> It's actually not so bad since the virtio config space is already read
> one byte at a time. The following should help.
>
> diff --git a/qemu/hw/virtio-blk.c b/qemu/hw/virtio-blk.c
> index 0f55d2a..492bd7f 100644
> --- a/qemu/hw/virtio-blk.c
> +++ b/qemu/hw/virtio-blk.c
> @@ -134,8 +134,8 @@ static void virtio_blk_update_config(VirtIODevice *vdev, uin
>      int64_t capacity;
>
>      bdrv_get_geometry(s->bs, &capacity);
> -    blkcfg.capacity = capacity;
> -    blkcfg.seg_max = 128 - 2;
> +    blkcfg.capacity = cpu_to_le64(capacity);
> +    blkcfg.seg_max = cpu_to_le32(128 - 2);
>      memcpy(config, &blkcfg, sizeof(blkcfg));
>  }
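
For readers who want to check the arithmetic in the quoted diagnosis, here is
a minimal sketch in C. It only reproduces the numbers discussed above: the
bogus capacity byteswaps to 0x5000, and 0x5000 sectors of 512 bytes come to
10 MiB. The helper names bswap64() and sectors_to_bytes() are made up for this
illustration and are not taken from qemu or the kernel.

```c
#include <stdint.h>

/* Reverse the byte order of a 64-bit value (portable, no compiler builtins). */
static uint64_t bswap64(uint64_t v)
{
    uint64_t out = 0;
    for (int i = 0; i < 8; i++) {
        out = (out << 8) | (v & 0xff);  /* lowest byte of v moves toward the top */
        v >>= 8;
    }
    return out;
}

/* Convert a count of 512-byte blocks into bytes. */
static uint64_t sectors_to_bytes(uint64_t sectors)
{
    return sectors * 512u;
}
```

Calling bswap64(22517998136852480ULL) and passing the result to
sectors_to_bytes() gives the image size that the question "Is your disk image
10MB?" is based on.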
Thanks Anthony, you've saved me a lot of debugging time! Rusty, doing
64-bit PCI config space accesses with ioread8() definitely violates the
principle of least surprise, and would have taken me a long time to
track down. :(
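
To make the byte-at-a-time point concrete, the following is a rough sketch,
not the actual guest or qemu code. It models the config space as a plain byte
array and assumes the guest assembles the 64-bit field from eight single-byte
reads interpreted as little-endian, as described in this thread. All three
helper names are hypothetical. store_be64() stands in for what a big-endian
host produces when it copies a native int64_t into the config buffer without
cpu_to_le64().

```c
#include <stdint.h>

/* Hypothetical device side: lay out a 64-bit field in little-endian order,
 * which is the effect of cpu_to_le64() followed by memcpy() on any host. */
static void store_le64(uint8_t cfg[8], uint64_t v)
{
    for (int i = 0; i < 8; i++)
        cfg[i] = (uint8_t)(v >> (8 * i));
}

/* Byte layout a big-endian host would produce from a plain native copy. */
static void store_be64(uint8_t cfg[8], uint64_t v)
{
    for (int i = 0; i < 8; i++)
        cfg[i] = (uint8_t)(v >> (8 * (7 - i)));
}

/* Hypothetical guest side: one byte per access, interpreted as little-endian.
 * The result does not depend on the endianness of the CPU running this. */
static uint64_t read_config_le64(const uint8_t cfg[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)cfg[i] << (8 * i);
    return v;
}
```

With store_le64() the guest reads back 0x5000; with store_be64() it reads
0x0050000000000000, the value Christian saw in /proc/partitions.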
Attached is a boot log of a PowerPC guest booting from a virtio-blk root.
"ramdisk_image" is the standard ~4MB image provided with the DENX Embedded
Linux Development Kit. Booting is also *way* faster than NFS root (a few
seconds to get to a shell). :)
--
Hollis Blanchard
IBM Linux Technology Center
[-- Attachment #2: virtio-blk.log --]
[-- Type: text/x-log, Size: 5196 bytes --]
bash-3.00# ./qemu-system-ppcemb -M bamboo -nographic -kernel ../../uImage.bamboo -L ../pc-bios/ -append "root=/dev/vda rw debug" -net nic,model=virtio -net tap -drive file=/images/ramdisk_image,if=virtio,boot=on
bamboo_init: START
Ram size passed is: 144 MB
Calling function ppc440_init
setup mmio
setup universal controller
trying to setup sdram controller
sdram_unmap_bcr: Unmap RAM area 0000000000000000 00400000
sdram_unmap_bcr: Unmap RAM area 0000000000000000 00400000
sdram_set_bcr: Map RAM area 0000000000000000 08000000
sdram_set_bcr: Map RAM area 0000000000000000 01000000
Initializing first serial port
ppc405_serial_init: offset 0000000000000300
Done calling ppc440_init
bamboo_init: load kernel
kernel is at guest address: 0x0
bamboo_init: load device tree file
device tree address is at guest address: 0x2b2100
bamboo_init: loading kvm registers
bamboo_init: DONE
Using Bamboo machine description
Linux version 2.6.25-rc3-hg1858cec8eb87-dirty (hollisb@basalt) (gcc version 3.4.2) #152 Tue Apr 1 10:52:01 CDT 2008
Found legacy serial port 0 for /plb/opb/serial@ef600300
mem=ef600300, taddr=ef600300, irq=0, clk=11059200, speed=115200
Found legacy serial port 1 for /plb/opb/serial@ef600400
mem=ef600400, taddr=ef600400, irq=0, clk=11059200, speed=0
console [udbg0] enabled
Entering add_active_range(0, 0, 36864) 0 entries of 256 used
setup_arch: bootmem
arch: exit
Top of RAM: 0x9000000, Total RAM: 0x9000000
Memory hole size: 0MB
Zone PFN ranges:
DMA 0 -> 36864
Normal 36864 -> 36864
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
0: 0 -> 36864
On node 0 totalpages: 36864
DMA zone: 288 pages used for memmap
DMA zone: 0 pages reserved
DMA zone: 36576 pages, LIFO batch:7
Normal zone: 0 pages used for memmap
Movable zone: 0 pages used for memmap
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 36576
Kernel command line: root=/dev/vda rw debug
irq: Allocated host of type 2 @0xc03f3880
UIC0 (32 IRQ sources) at DCR 0xc0
irq: Default host set to @0xc03f3880
PID hash table entries: 1024 (order: 10, 4096 bytes)
time_init: decrementer frequency = 666.666660 MHz
time_init: processor frequency = 666.666660 MHz
clocksource: timebase mult[600000] shift[22] registered
clockevent: decrementer mult[aaaa] shift[16] cpu[0]
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 143060k/147456k available (2632k kernel code, 4252k reserved, 100k data, 125k bss, 132k init)
SLUB: Genslabs=10, HWalign=32, Order=0-1, MinObjects=4, CPUs=1, Nodes=1
Calibrating delay loop... 2490.36 BogoMIPS (lpj=4980736)
Mount-cache hash table entries: 512
net_namespace: 156 bytes
NET: Registered protocol family 16
PCI host bridge /plb/pci@ec000000 (primary) ranges:
MEM 0x00000000a0000000..0x00000000bfffffff -> 0x00000000a0000000
IO 0x00000000e8000000..0x00000000e800ffff -> 0x0000000000000000
4xx PCI DMA offset set to 0x00000000
PCI: Probing PCI hardware
PCI: Hiding 4xx host bridge resources 0000:00:00.0
irq: irq_create_mapping(0xc03f3880, 0x1c)
irq: -> using host @c03f3880
irq: -> obtained virq 28
irq: irq_create_mapping(0xc03f3880, 0x1b)
irq: -> using host @c03f3880
irq: -> obtained virq 27
Time: timebase clocksource has been installed.
NET: Registered protocol family 2
IP route cache hash table entries: 2048 (order: 1, 8192 bytes)
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
TCP reno registered
irq: irq_create_mapping(0xc03f3880, 0x0)
irq: -> using host @c03f3880
irq: -> obtained virq 16
irq: irq_create_mapping(0xc03f3880, 0x1)
irq: -> using host @c03f3880
irq: -> obtained virq 17
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250.0: ttyS0 at MMIO 0xef600300 (irq = 16) is a 16450
console handover: boot [udbg0] -> real [ttyS0]
irq: irq_create_mapping(0xc03f3880, 0x0)
irq: -> using host @c03f3880
irq: -> existing mapping on virq 16
ef600300.serial: ttyS0 at MMIO 0xef600300 (irq = 16) is a 16450
irq: irq_create_mapping(0xc03f3880, 0x1)
irq: -> using host @c03f3880
irq: -> existing mapping on virq 17
brd: module loaded
Intel(R) PRO/1000 Network Driver - version 7.3.20-k2
Copyright (c) 1999-2006 Intel Corporation.
pcnet32.c:v1.34 14.Aug.2007 tsbogend@alpha.franken.de
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
PCI: Enabling device 0000:00:01.0 (0000 -> 0001)
vda: unknown partition table
PCI: Enabling device 0000:00:02.0 (0000 -> 0001)
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
VFS: Mounted root (ext2 filesystem).
Freeing unused kernel memory: 132k init
root:~> ### Application running ...
root:~> ls
bin etc home linuxrc sbin usr
dev ftp lib proc tmp var
root:~>