From: Anthony Liguori <anthony@codemonkey.ws>
To: Avi Kivity <avi@redhat.com>
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org
Subject: Re: [Qemu-devel] [RFC v4 00/58] Memory API
Date: Wed, 20 Jul 2011 09:31:48 -0500
Message-ID: <4E26E6D4.6030401@codemonkey.ws>
In-Reply-To: <4E268D69.9010402@redhat.com>
On 07/20/2011 03:10 AM, Avi Kivity wrote:
> On 07/19/2011 11:51 PM, Anthony Liguori wrote:
>> On 07/19/2011 11:10 AM, Avi Kivity wrote:
>>> On 07/19/2011 07:05 PM, Avi Kivity wrote:
>>>> On 07/19/2011 05:50 PM, Anthony Liguori wrote:
>>>>>
>>>>>>>
>>>>>>> There's bits I don't like about the interface
>>>>>>
>>>>>> Which bits are these?
>>>>>
>>>>> Nothing I haven't already commented on. I think there's too much in
>>>>> the generic level. I don't think coalesced I/O belongs here. It's a
>>>>> concept that doesn't fit. I think a side-band API would be nicer.
>>>>
>>>> Well, it's impossible to do it in a side band. When a range that has
>>>> coalesced mmio is exposed is completely orthogonal to programming the
>>>> BAR register - it can happen, for example, due to another BAR being
>>>> removed or the bridge window being programmed. You can also have a
>>>> coalesced mmio region being partially clipped.
>>>
>>> Of course, it's not really impossible, just clumsy.
>>
>> There are exactly two devices that use coalesced I/O: VGA and e1000.
>>
>> VGA does coalesced I/O over the legacy VGA region (0xa0000 ...
>> 0xc0000). This region is very special in the PC and is directly routed
>> by the I440FX to the appropriate first PCI graphics card.
>>
>> The VGA device knows exactly where this region is mapped.
>
> The VGA device doesn't know *if* it is mapped. It can be obstructed by
> the chipset and by SMM. Other chipsets we emulate may support multiple
> VGA cards.
The i440fx can support multiple VGA cards just fine.
Legacy region accesses are always routed by the PCI bus to the first PCI
device that identifies itself as a graphics card.
The card is well aware of whether or not it is getting legacy VGA
accesses because only one card can register for this area.
>> The e1000 does coalesced I/O for its memory registers. But it's
>> dubious how much this actually matters anymore. The original claim was
>> a 10% boost with iperf.
>>
>> The e1000 is not performance competitive with virtio-net though so it
>> certainly is reasonable to assume that no one would notice if we
>> removed coalesced I/O from the e1000.
>
> The e1000 NIC is the best we have for guests that don't support virtio.
> It's not reasonable to reduce its performance.
So let's talk about real numbers. This is netperf with a default
invocation from guest to host. All numbers are in MB/sec, two runs per
configuration.

rtl8139
-------
119.45
118.12

e1000 w/coalesced mmio
----------------------
425.93
424.08

e1000 w/o coalesced mmio
------------------------
419.13
413.83

virtio-net
----------
4330.52
4419.90

So removing coalesced MMIO from the e1000 results in a massive ~2%
slowdown :-)
And while the e1000 is > 100% faster than the rtl8139, it's still an
order of magnitude slower than the userspace virtio-net.
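
For anyone who wants to re-derive the ratios from the figures above, here
is a minimal sketch. It assumes the two runs per configuration are simply
averaged; the names (runs, mean, relative_change) are made up for this
illustration and are not part of netperf or QEMU.

```python
# Throughput samples in MB/sec, copied from the netperf runs quoted above.
runs = {
    "rtl8139": [119.45, 118.12],
    "e1000 w/coalesced mmio": [425.93, 424.08],
    "e1000 w/o coalesced mmio": [419.13, 413.83],
    "virtio-net": [4330.52, 4419.90],
}


def mean(samples):
    """Arithmetic mean of a non-empty list of samples."""
    return sum(samples) / len(samples)


def relative_change(baseline, other):
    """Fractional change of `other` relative to `baseline`."""
    return (other - baseline) / baseline


# Assumption: averaging the two runs is a fair summary of each configuration.
averages = {name: mean(samples) for name, samples in runs.items()}

# Slowdown from dropping coalesced MMIO, as a positive fraction.
slowdown = -relative_change(
    averages["e1000 w/coalesced mmio"],
    averages["e1000 w/o coalesced mmio"],
)

# Speed ratios between the device models.
e1000_vs_rtl8139 = averages["e1000 w/coalesced mmio"] / averages["rtl8139"]
virtio_vs_e1000 = averages["virtio-net"] / averages["e1000 w/coalesced mmio"]

print(f"e1000 slowdown without coalesced MMIO: {slowdown:.1%}")
print(f"e1000 vs rtl8139: {e1000_vs_rtl8139:.2f}x")
print(f"virtio-net vs e1000: {virtio_vs_e1000:.2f}x")
```

With only two samples per configuration, the run-to-run spread is of the
same order as the difference being measured, so treat the slowdown figure
as a rough estimate.
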
I'm confident that the e1000 could be improved if someone modified it to
optimally use the new netdev interfaces. But no one cares that much
about the performance of the e1000. And if we dropped coalesced MMIO
support for the e1000, no one would notice.
Exit costs have changed dramatically over the years. Optimizations
that made sense with P4 class hardware don't necessarily make sense these
days. QEMU has also changed a lot so bottlenecks are no longer where
they used to be.
>> The point is, it's so incredibly special cased that having it as part
>> of such a general purpose API seems wrong. Of the hundreds of devices,
>> we only have one device that we know for sure really needs it and it
>> could easily be done independent of the memory API for that device.
>>
>
> We either support coalesced mmio well, or not at all. Even if the API
> has only one user, that doesn't excuse doing it badly.
It's not at all that black and white. We need to carefully choose what
we model and then have the flexibility to break those models in the name
of performance.
If we try to make everything fit elegantly into a model, we'll end up
with something that's overly complex just to accommodate a single user.
That's my general concern with where we're going here.
I don't think it's too bad and as I said, I don't object to it in its
current form. But I think it could be simplified. Even in its current
non-simple form, it's better than what we currently have.
Regards,
Anthony Liguori