From: Gleb Natapov <gleb@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: kevin@koconnor.net, qemu-devel@nongnu.org
Subject: [Qemu-devel] Re: [PATCH 4/5] Make MMIO address page aligned in guest.
Date: Mon, 12 Oct 2009 10:48:58 +0200
Message-ID: <20091012084858.GY16702@redhat.com>
In-Reply-To: <20091012081314.GB10741@redhat.com>
On Mon, Oct 12, 2009 at 10:13:14AM +0200, Michael S. Tsirkin wrote:
> > > > >
> > > > > This wastes memory for non-assigned devices. I think it's better, and
> > > > > cleaner, to make qemu increase the BAR size up to 4K for assigned
> > > > > devices if it wants page size alignment.
> > > > >
> > > > We have three and a half devices in QEMU, so I don't think memory is a
> > > > big concern. Regardless, if you think that fiddling with assigned devices'
> > > > responses is a better idea, go ahead and send patches.
> > >
> > > Even if you fiddle with BIOS, guest is allowed to reassign BARs,
> > > breaking your assumptions.
> > Good point. So the fact that this patch helped its creator shows that
> > Linux doesn't do this.
>
> Try hot-plugging the device instead of having it present on boot.
> Patching BIOS won't help then, will it? So my question is, if we need
> to handle this in qemu, is it worth it to do it in kvm as well?
>
It depends on how Linux assigns MMIO addresses to hot-pluggable devices. How
can you be sure a device driver will continue working if you misrepresent
the BAR size, BTW?
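
For context on why the reported BAR size matters: software conventionally sizes a memory BAR by writing all ones to it and decoding the value read back, so a padded BAR is indistinguishable from a genuinely larger one. A minimal sketch of that decode step, assuming a 32-bit memory BAR; the helper name bar_size_from_readback is hypothetical and is not taken from SeaBIOS, QEMU, or Linux:

```c
#include <stdint.h>

/*
 * Hypothetical helper: decode the size of a 32-bit memory BAR from the
 * value read back after writing 0xffffffff to it. The low 4 bits of a
 * memory BAR are flag bits rather than address bits, so they are masked
 * off before decoding.
 */
static uint32_t bar_size_from_readback(uint32_t readback)
{
    uint32_t mask = readback & 0xfffffff0u;

    if (mask == 0)
        return 0;          /* BAR not implemented */
    return ~mask + 1u;     /* lowest writable address bit gives the size */
}
```

Under this sketch a 2K BAR reads back as 0xfffff800, while the same BAR padded up to a page would read back as 0xfffff000 and be decoded as 4K.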
> > > > As it stands this
> > > > patch is in kvm's bios and is required for assigned devices to work
> > > > for some devices, so moving to seabios without this patch will introduce
> > > > a regression.
> > >
> > > I have a question here: if kvm maps a full physical page
> > > into guest memory, while device only uses part of the page,
> > > won't that mean that guest is granted access outside the
> > > device, which it should not have?
> > And how is real HW different? It maps a full physical page into OS
> > memory even if the BAR is smaller than a page, and grants the OS access
> > to the unassigned MMIO region. Accessing an unassigned MMIO region
> > shouldn't cause any trouble, should it?
>
> Unassigned - typically no, but there can be another device there, or a RAM
> page. It is different on real hardware where OS has access to all RAM and all
> devices, anyway.
>
> Here's an example from my laptop:
>
> 00:03.0 Communication controller: Intel Corporation Mobile 4 Series Chipset MEI Controller (rev 07)
> Subsystem: Lenovo Device 20e6
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx+
> Latency: 0
> Interrupt: pin A routed to IRQ 11
> Region 0: Memory at fc226800 (64-bit, non-prefetchable) [size=16]
> Capabilities: <access denied>
>
> ...
>
> 00:1f.2 SATA controller: Intel Corporation ICH9M/M-E SATA AHCI Controller (rev 03) (prog-if 01 [AHCI 1.0])
> Subsystem: Lenovo Device 20f8
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> Latency: 0
> Interrupt: pin B routed to IRQ 28
> Region 0: I/O ports at 1c48 [size=8]
> Region 1: I/O ports at 183c [size=4]
> Region 2: I/O ports at 1c40 [size=8]
> Region 3: I/O ports at 1838 [size=4]
> Region 4: I/O ports at 1c20 [size=32]
> Region 5: Memory at fc226000 (32-bit, non-prefetchable) [size=2K]
> Capabilities: <access denied>
> Kernel driver in use: ahci
>
> In this setup, if you assign a page at address fc226000, for SATA,
> I think that guest will be able to control Communication controller as well.
Who configures BARs for an assigned device, the guest or the host? If it is
the host, you can't safely pass through one of those devices. But passthrough
is not secure anyway, since the guest can DMA all over host memory.
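
The overlap in the lspci example above can be checked with a little arithmetic. A minimal sketch, assuming 4K pages and that whole pages are mapped for the assigned BAR; the helper names page_floor and page_mapping_exposes are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_4K 0x1000ull

/* Hypothetical helper: first byte of the 4K page containing addr. */
static uint64_t page_floor(uint64_t addr)
{
    return addr & ~(PAGE_SIZE_4K - 1);
}

/*
 * Hypothetical helper: returns true if mapping whole pages to cover
 * region A [base_a, base_a + size_a) would also expose at least one
 * byte of region B [base_b, base_b + size_b). Sizes must be nonzero.
 */
static bool page_mapping_exposes(uint64_t base_a, uint64_t size_a,
                                 uint64_t base_b, uint64_t size_b)
{
    uint64_t a_start = page_floor(base_a);
    uint64_t a_end = page_floor(base_a + size_a - 1) + PAGE_SIZE_4K; /* exclusive */
    uint64_t b_end = base_b + size_b;                                /* exclusive */

    return a_start < b_end && base_b < a_end;
}
```

With the values quoted above, page_mapping_exposes(0xfc226000, 0x800, 0xfc226800, 16) is true: the page covering the 2K SATA BAR also contains the 16-byte MEI BAR.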
>
> > > Maybe the solution is to disable bypass for sub-page BARs and to
> > > handle them in qemu, where we don't have alignment restrictions?
> > >
> > Making the fast path go through qemu for assigned devices? Maybe remove
> > this passthrough crap from kvm to save us all from this misery then?
>
> Another option is for KVM to check these scenarios and deny assignment if
> there's such an overlap.
One more constraint for device assignment. Simple real-life scenarios
don't work for our users as it is. Adding more constraints will not help.
>
> > > > >
> > > > > > ---
> > > > > > src/pciinit.c | 7 +++++++
> > > > > > 1 files changed, 7 insertions(+), 0 deletions(-)
> > > > > >
> > > > > > diff --git a/src/pciinit.c b/src/pciinit.c
> > > > > > index 29b3901..53fbfcf 100644
> > > > > > --- a/src/pciinit.c
> > > > > > +++ b/src/pciinit.c
> > > > > > @@ -10,6 +10,7 @@
> > > > > > #include "biosvar.h" // GET_EBDA
> > > > > > #include "pci_ids.h" // PCI_VENDOR_ID_INTEL
> > > > > > #include "pci_regs.h" // PCI_COMMAND
> > > > > > +#include "paravirt.h"
> > > > > >
> > > > > > #define PCI_ROM_SLOT 6
> > > > > > #define PCI_NUM_REGIONS 7
> > > > > > @@ -158,6 +159,12 @@ static void pci_bios_init_device(u16 bdf)
> > > > > > *paddr = ALIGN(*paddr, size);
> > > > > > pci_set_io_region_addr(bdf, i, *paddr);
> > > > > > *paddr += size;
> > > > > > + if (kvm_para_available()) {
> > > > > > + /* make memory address page aligned */
> > > > > > + /* needed for device assignment on kvm */
> > > > > > + if (!(val & PCI_BASE_ADDRESS_SPACE_IO))
> > > > > > + *paddr = (*paddr + 0xfff) & 0xfffff000;
> > > > > > + }
> > > > > > }
> > > > > > }
> > > > > > break;
> > > > > > --
> > > > > > 1.6.3.3
> > > > > >
> > > > > >
> > > >
> > > > --
> > > > Gleb.
> >
> > --
> > Gleb.
--
Gleb.
Thread overview: 34+ messages
2009-10-11 18:59 [Qemu-devel] [PATCH 1/5] Generate mptable unconditionally Gleb Natapov
2009-10-11 18:59 ` [Qemu-devel] [PATCH 2/5] Enable power button event generation Gleb Natapov
2009-10-11 18:59 ` [Qemu-devel] [PATCH 3/5] Use the correct mask to size the PCI option ROM BAR Gleb Natapov
2009-10-11 21:53 ` [Qemu-devel] " Michael S. Tsirkin
2009-10-12 6:50 ` Gleb Natapov
2009-10-12 9:52 ` Michael S. Tsirkin
2009-10-12 10:08 ` Gleb Natapov
2009-10-12 11:03 ` Michael S. Tsirkin
2009-10-12 11:45 ` Michael S. Tsirkin
2009-10-12 11:48 ` Gleb Natapov
2009-10-12 11:59 ` Michael S. Tsirkin
2009-10-12 12:08 ` Gleb Natapov
2009-10-12 13:20 ` Michael S. Tsirkin
2009-10-12 13:29 ` Gleb Natapov
2009-10-12 13:51 ` Michael S. Tsirkin
2009-10-12 14:04 ` Gleb Natapov
2009-10-12 14:11 ` Michael S. Tsirkin
2009-10-12 14:17 ` Gleb Natapov
2009-10-12 14:24 ` Michael S. Tsirkin
2009-10-12 14:20 ` [Qemu-devel] seabios: fix low bits in ROM and I/O sizing Michael S. Tsirkin
2009-10-13 13:39 ` [Qemu-devel] " Gleb Natapov
2009-10-14 23:29 ` Kevin O'Connor
2009-10-11 18:59 ` [Qemu-devel] [PATCH 4/5] Make MMIO address page aligned in guest Gleb Natapov
2009-10-11 21:48 ` [Qemu-devel] " Michael S. Tsirkin
2009-10-12 6:44 ` Gleb Natapov
2009-10-12 7:10 ` Michael S. Tsirkin
2009-10-12 7:22 ` Gleb Natapov
2009-10-12 8:13 ` Michael S. Tsirkin
2009-10-12 8:48 ` Gleb Natapov [this message]
2009-10-12 9:43 ` Michael S. Tsirkin
2009-10-12 10:06 ` Gleb Natapov
2009-10-12 14:27 ` Kevin O'Connor
2009-10-11 18:59 ` [Qemu-devel] [PATCH 5/5] Set the PCI base address to 0xf0000000 Gleb Natapov
2009-10-12 14:24 ` [Qemu-devel] " Kevin O'Connor