* [Qemu-devel] PCI access virtualization
@ 2006-01-05 10:14 Michael Renzmann
2006-01-05 14:25 ` Paul Brook
2006-01-05 16:25 ` Mark Williamson
0 siblings, 2 replies; 9+ messages in thread
From: Michael Renzmann @ 2006-01-05 10:14 UTC (permalink / raw)
To: qemu-devel
Hi all.
Disclaimer: I'm quite new to things like virtual machines, and my
knowledge of QEMU is half-baked at best. I'm aware that my idea might be
absurd, but it's still worth a try :)
I'm wondering if it would be possible to modify QEMU such that VMs could
access PCI devices on the host system. And, if it would be possible at
all, how much work this would be.
Background: I'm one of the developers of MadWifi, which is a linux
driver for WLAN cards based on Atheros chipsets. QEMU could be a great
help in testing distribution-specific issues, as well as issues related
to SMP operation. The only downside is that it's not easily possible to
emulate the necessary WLAN hardware.
Now, if it would be possible to allow a VM to access physical PCI
devices on the host, it could make use of non-emulated hardware for this
purpose. Even more, thanks to the SMP emulation that recently went into
QEMU, it would be possible to test the compatibility of the driver with
SMP systems, even if the host is just a UP system.
I vaguely heard of a feature present in Xen which allows assigning PCI
devices to one of the guests. I understand Xen works differently from
QEMU, but maybe it would be possible to implement something similar.
Any comments appreciated, thanks in advance.
Bye, Mike
* Re: [Qemu-devel] PCI access virtualization
2006-01-05 10:14 [Qemu-devel] PCI access virtualization Michael Renzmann
@ 2006-01-05 14:25 ` Paul Brook
2006-01-05 17:40 ` Mark Williamson
2006-01-05 16:25 ` Mark Williamson
1 sibling, 1 reply; 9+ messages in thread
From: Paul Brook @ 2006-01-05 14:25 UTC (permalink / raw)
To: qemu-devel, mrenzmann
> Background: I'm one of the developers of MadWifi, which is a linux
> driver for WLAN cards based on Atheros chipsets. QEMU could be a great
> help in testing distribution-specific issues, as well as issues related
> to SMP operation. The only downside is that it's not easily possible to
> emulate the necessary WLAN hardware.
> Now, if it would be possible to allow a VM to access physical PCI
> devices on the host, it could make use of non-emulated hardware for this
> purpose. Even more, thanks to the SMP emulation that recently went into
> QEMU, it would be possible to test the compatibility of the driver with
> SMP systems, even if the host is just a UP system.
>
There are two problems:
- IRQ sharing. Sharing host IRQs between native and virtualized devices is
hard because the host needs to ack the interrupt in the IRQ handler, but
doesn't really know how to do that until after it's run the guest to see what
that does.
- DMA. qemu needs to rewrite DMA requests (in both directions) because the
guest physical memory won't be at the same address on the host. Unlike ISA,
where there's a fixed DMA engine, I don't think there's any general way of
doing this.
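To make the DMA problem concrete, a passthrough layer would need something
along these lines - all the names below are made up, it's just a sketch of
the shape, not code from qemu or any existing patch:

  /* Sketch only: rewriting a guest-programmed DMA address before it is
   * handed to the real card. None of these names exist in qemu; the
   * sketch also assumes guest RAM is one physically contiguous block on
   * the host, which in general it is not - that's part of the problem. */
  #include <stdint.h>

  extern uint64_t guest_ram_host_phys;   /* host-physical base of guest RAM */

  static uint32_t fixup_dma_address(uint32_t guest_phys)
  {
      /* guest physical 0 corresponds to guest_ram_host_phys on the host */
      return (uint32_t)(guest_ram_host_phys + guest_phys);
  }

  static void emulate_dma_reg_write(volatile uint32_t *dma_addr_reg,
                                    uint32_t guest_phys)
  {
      /* This only works if we know that this particular register carries
       * an address - and for an arbitrary PCI card we don't, which is why
       * there's no general solution. */
      *dma_addr_reg = fixup_dma_address(guest_phys);
  }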
There are patches that allow virtualization of PCI devices that don't use
either of the above features. It's sufficient to get some Network cards
working, but that's about it.
> I vaguely heard of a feature present in Xen which allows assigning PCI
> devices to one of the guests. I understand Xen works differently from
> QEMU, but maybe it would be possible to implement something similar.
Xen is much easier because it cooperates with the host system (ie. xen), so
both the above problems can be solved by tweaking the guest OS drivers/PCI
subsystem setup.
If you're testing specific drivers you could probably augment these drivers to
pass the extra required information to qemu. ie. effectively use a special
qemu pseudo-PCI interface rather than the normal piix PCI interface.
Paul
* Re: [Qemu-devel] PCI access virtualization
2006-01-05 14:25 ` Paul Brook
@ 2006-01-05 17:40 ` Mark Williamson
2006-01-05 18:10 ` Paul Brook
0 siblings, 1 reply; 9+ messages in thread
From: Mark Williamson @ 2006-01-05 17:40 UTC (permalink / raw)
To: qemu-devel; +Cc: Paul Brook
> - IRQ sharing. Sharing host IRQs between native and virtualized devices is
> hard because the host needs to ack the interrupt in the IRQ handler, but
> doesn't really know how to do that until after it's run the guest to see
> what that does.
Could maybe have the (inevitable) kernel portion of the code grab the
interrupt, and not ack it until userspace does an ioctl on a special file (or
something like that?). There are patches floating around for userspace IRQ
handling, so I guess that could work.
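On the host side I'm picturing something like this - purely a sketch, every
identifier (pcifwd_*, the ioctl number) is invented, and the file_operations
plumbing is omitted:

  /* Sketch: the kernel side grabs the shared IRQ, masks it, and lets the
   * userspace qemu process ack/re-enable it via an ioctl once the guest
   * driver has serviced the card. All identifiers are invented. */
  #include <linux/interrupt.h>
  #include <linux/ioctl.h>
  #include <linux/fs.h>
  #include <linux/wait.h>

  #define PCIFWD_IOC_ACK _IO('P', 0)       /* hypothetical ioctl number */

  static int pcifwd_irq_line;              /* set when the IRQ is requested */
  static int pcifwd_pending;
  static DECLARE_WAIT_QUEUE_HEAD(pcifwd_wq);

  static irqreturn_t pcifwd_handler(int irq, void *dev_id)
  {
      /* Mask the line so the host doesn't loop on the unacked interrupt,
       * then wake whoever (qemu) is waiting to inject it into the guest. */
      disable_irq_nosync(irq);
      pcifwd_pending = 1;
      wake_up_interruptible(&pcifwd_wq);
      return IRQ_HANDLED;
  }

  /* qemu blocks (read()/poll() on the special file, not shown) until an
   * interrupt arrives, runs the guest's handler, then calls this ioctl. */
  static long pcifwd_ioctl(struct file *file, unsigned int cmd,
                           unsigned long arg)
  {
      if (cmd != PCIFWD_IOC_ACK)
          return -ENOTTY;
      pcifwd_pending = 0;
      enable_irq(pcifwd_irq_line);         /* guest has acked the device */
      return 0;
  }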
> - DMA. qemu needs to rewrite DMA requests (in both directions) because the
> guest physical memory won't be at the same address on the host. Unlike ISA,
> where there's a fixed DMA engine, I don't think there's any general way of
> doing this.
I was under the impression that you could get reasonably far by emulating a
few of the most popular commercial DMA engine chips and reissuing
address-corrected commands to the host. I'm not sure how common it is for
PCI cards to use custom DMA chips instead, though...
> There are patches that allow virtualization of PCI devices that don't use
> either of the above features. It's sufficient to get some Network cards
> working, but that's about it.
I guess PIO should be easy to get working in any case.
> > I vaguely heard of a feature present in Xen which allows assigning PCI
> > devices to one of the guests. I understand Xen works differently from
> > QEMU, but maybe it would be possible to implement something similar.
>
> Xen is much easier because it cooperates with the host system (ie. xen), so
> both the above problems can be solved by tweaking the guest OS drivers/PCI
> subsystem setup.
Yep, XenLinux redefines various macros that were already present to do
guest-physical <-> host-physical address translations, so DMA Just Works
(TM).
> If you're testing specific drivers you could probably augment these drivers
> to pass the extra required information to qemu. ie. effectively use a
> special qemu pseudo-PCI interface rather than the normal piix PCI
> interface.
How about something like this?:
I'd imagine you could get away with a special header file with different macro
defines (as for Xen, above), just in the driver in question, and a special
"translation device / service" available to the QEmu virtual machine - could
be as simple as "write the guest physical address to an IO port, returns the
real physical address on next read". The virt_to_bus (etc) macros would use
the translation service to perform the appropriate translation at runtime.
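In code, the guest-side part could be as small as this - the port number and
names are invented, just to make the idea concrete:

  /* Guest-side sketch of the translation service: the driver's address
   * conversion asks a magic qemu I/O port to turn a guest-physical
   * address into the corresponding host-physical one. The port number
   * and all names are purely illustrative. */
  #include <asm/io.h>

  #define QEMU_XLATE_PORT 0x300            /* hypothetical pseudo-device port */

  static inline unsigned long qemu_virt_to_bus(volatile void *addr)
  {
      outl(virt_to_phys((void *)addr), QEMU_XLATE_PORT); /* guest phys in */
      return inl(QEMU_XLATE_PORT);                       /* host phys out */
  }

  /* The driver under test would then build with something like
   *   #undef  virt_to_bus
   *   #define virt_to_bus(addr) qemu_virt_to_bus(addr)
   * so only that one driver goes through the translation service. */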
Cheers,
Mark
* Re: [Qemu-devel] PCI access virtualization
2006-01-05 17:40 ` Mark Williamson
@ 2006-01-05 18:10 ` Paul Brook
2006-01-05 21:13 ` Jim C. Brown
2006-01-06 1:54 ` Mark Williamson
0 siblings, 2 replies; 9+ messages in thread
From: Paul Brook @ 2006-01-05 18:10 UTC (permalink / raw)
To: Mark Williamson; +Cc: qemu-devel
On Thursday 05 January 2006 17:40, Mark Williamson wrote:
> > - IRQ sharing. Sharing host IRQs between native and virtualized devices
> > is hard because the host needs to ack the interrupt in the IRQ handler,
> > but doesn't really know how to do that until after it's run the guest to
> > see what that does.
>
> Could maybe have the (inevitable) kernel portion of the code grab the
> interrupt, and not ack it until userspace does an ioctl on a special file
> (or something like that?). There are patches floating around for userspace
> IRQ handling, so I guess that could work.
This still requires cooperation from both sides (ie. both the host and guest
drivers).
> > - DMA. qemu needs to rewrite DMA requests (in both directions) because
> > the guest physical memory won't be at the same address on the host.
> > Unlike ISA, where there's a fixed DMA engine, I don't think there's any
> > general way of doing this.
>
> I was under the impression that you could get reasonably far by emulating a
> few of the most popular commercial DMA engine chips and reissuing
> address-corrected commands to the host. I'm not sure how common it is for
> PCI cards to use custom DMA chips instead, though...
IIUC PCI cards don't really have "DMA engines" as such. The PCI bridge just
maps PCI address space onto physical memory. A bus-master PCI device can then
make arbitrary accesses whenever it wants. I expect the default mapping is a
1:1 mapping of the first 4G of physical RAM.
> > There are patches that allow virtualization of PCI devices that don't use
> > either of the above features. It's sufficient to get some Network cards
> > working, but that's about it.
>
> I guess PIO should be easy to work in any case.
Yes.
> > > I vaguely heard of a feature present in Xen which allows assigning PCI
> > > devices to one of the guests. I understand Xen works differently from
> > > QEMU, but maybe it would be possible to implement something similar.
> >
> > Xen is much easier because it cooperates with the host system (ie. xen),
> > so both the above problems can be solved by tweaking the guest OS
> > drivers/PCI subsystem setup.
>
> Yep, XenLinux redefines various macros that were already present to do
> guest-physical <-> host-physical address translations, so DMA Just Works
> (TM).
>
> > If you're testing specific drivers you could probably augment these
> > drivers to pass the extra required information to qemu. ie. effectively
> > use a special qemu pseudo-PCI interface rather than the normal piix PCI
> > interface.
>
> How about something like this?:
> I'd imagine you could get away with a special header file with different
> macro defines (as for Xen, above), just in the driver in question, and a
> special "translation device / service" available to the QEmu virtual
> machine - could be as simple as "write the guest physical address to an IO
> port, returns the real physical address on next read". The virt_to_bus
> (etc) macros would use the translation service to perform the appropriate
> translation at runtime.
That's exactly the sort of thing I meant. Ideally you'd just implement it as a
different type of PCI bridge, and everything would just work. I don't know if
linux supports such heterogeneous configurations though.
Paul
* Re: [Qemu-devel] PCI access virtualization
2006-01-05 18:10 ` Paul Brook
@ 2006-01-05 21:13 ` Jim C. Brown
2006-01-06 1:54 ` Mark Williamson
1 sibling, 0 replies; 9+ messages in thread
From: Jim C. Brown @ 2006-01-05 21:13 UTC (permalink / raw)
To: Paul Brook; +Cc: qemu-devel
On Thu, Jan 05, 2006 at 06:10:54PM +0000, Paul Brook wrote:
> On Thursday 05 January 2006 17:40, Mark Williamson wrote:
> > > - IRQ sharing. Sharing host IRQs between native and virtualized devices
> > > is hard because the host needs to ack the interrupt in the IRQ handler,
> > > but doesn't really know how to do that until after it's run the guest to
> > > see what that does.
> >
> > Could maybe have the (inevitable) kernel portion of the code grab the
> > interrupt, and not ack it until userspace does an ioctl on a special file
> > (or something like that?). There are patches floating around for userspace
> > IRQ handling, so I guess that could work.
>
> This still requires cooperation from both sides (ie. both the host and guest
> drivers).
>
This would be a lot easier if linux supported user-space device drivers...
and I believe work is being done in that area.
--
Infinite complexity begets infinite beauty.
Infinite precision begets infinite perfection.
* Re: [Qemu-devel] PCI access virtualization
2006-01-05 18:10 ` Paul Brook
2006-01-05 21:13 ` Jim C. Brown
@ 2006-01-06 1:54 ` Mark Williamson
2006-01-06 13:27 ` Paul Brook
1 sibling, 1 reply; 9+ messages in thread
From: Mark Williamson @ 2006-01-06 1:54 UTC (permalink / raw)
To: Paul Brook; +Cc: qemu-devel
> > Could maybe have the (inevitable) kernel portion of the code grab the
> > interrupt, and not ack it until userspace does an ioctl on a special file
> > (or something like that?). There are patches floating around for
> > userspace IRQ handling, so I guess that could work.
>
> This still requires cooperation from both sides (ie. both the host and
> guest drivers).
True.
> IIUC PCI cards don't really have "DMA engines" as such. The PCI bridge just
> maps PCI address space onto physical memory. A bus-master PCI device can
> then make arbitrary accesses whenever it wants. I expect the default mapping
> is a 1:1 mapping of the first 4G of physical RAM.
I was under the impression that the on-card bus-mastering engines were often
one of a few standard designs - this was suggested to me by someone more
knowledgeable than myself, but I must admit I don't have any hard evidence
that it's the case ;-)
A "half way" solution would, I guess, be to code up a partial emulator for the
device, emulating key operations (by translating them and reissuing to the
device) whilst allowing other operations to pass directly through.
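In other words, something like this on the qemu side - all names invented,
only the shape of it matters:

  /* Sketch of the "half way" emulator: trap all register writes, rewrite
   * the ones known to carry guest addresses, pass everything else through
   * to the real card untouched. All names here are invented. */
  #include <stdint.h>

  #define DMA_RING_BASE_REG 0x0010   /* hypothetical: register that takes a DMA address */

  extern uint64_t guest_to_host_phys(uint64_t guest_phys);      /* assumed helper */
  extern void real_card_write(uint32_t offset, uint32_t value); /* assumed helper */

  static void passthrough_reg_write(uint32_t offset, uint32_t value)
  {
      if (offset == DMA_RING_BASE_REG) {
          /* key operation: translate, then reissue to the device */
          real_card_write(offset, (uint32_t)guest_to_host_phys(value));
      } else {
          /* everything else goes straight through */
          real_card_write(offset, value);
      }
  }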
> > How about something like this?:
> > I'd imagine you could get away with a special header file with different
> > macro defines (as for Xen, above), just in the driver in question, and a
> > special "translation device / service" available to the QEmu virtual
> > machine - could be as simple as "write the guest physical address to an
> > IO port, returns the real physical address on next read". The
> > virt_to_bus (etc) macros would use the translation service to perform the
> > appropriate translation at runtime.
>
> That's exactly the sort of thing I meant. Ideally you'd just implement it
> as a different type of PCI bridge, and everything would just work. I don't
> know if linux supports such heterogeneous configurations though.
I think that wouldn't be sufficient due to the way Linux behaves - virtual to
physical address conversion will be done within the guest driver and then
issued directly to the hardware, so a bridge driver won't be able to fix up
the DMA addressing correctly.
I guess with some header file hacks it'd be possible to make the conversion
macros themselves switchable, so that (in principle) any driver could be
loaded in "native" or "QEmu passthrough" mode by specifying an optional
load-time argument - that would be cool. Making it automatically switch
would be a bit tricky though... maybe the macros could be made to switch on
the basis of the module PCI ID table :-D *evil grin*
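Roughly what I have in mind, as a sketch - the parameter name and the
qemu_virt_to_bus() helper from the earlier sketch are invented:

  /* Sketch of the switchable conversion: a module parameter selects the
   * native path or the qemu translation service at load time. */
  #include <linux/module.h>
  #include <asm/io.h>

  static int qemu_passthrough;             /* 0 = native, 1 = running under qemu */
  module_param(qemu_passthrough, int, 0444);
  MODULE_PARM_DESC(qemu_passthrough,
                   "translate DMA addresses via the qemu pseudo-device");

  extern unsigned long qemu_virt_to_bus(volatile void *addr); /* earlier sketch */

  static inline unsigned long drv_virt_to_bus(volatile void *addr)
  {
      return qemu_passthrough ? qemu_virt_to_bus(addr)
                              : virt_to_bus((void *)addr);
  }

  /* The driver calls drv_virt_to_bus() wherever it used virt_to_bus(),
   * and loading it with qemu_passthrough=1 selects the passthrough path. */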
Cheers,
Mark
* Re: [Qemu-devel] PCI access virtualization
2006-01-06 1:54 ` Mark Williamson
@ 2006-01-06 13:27 ` Paul Brook
2006-01-06 14:23 ` Mark Williamson
0 siblings, 1 reply; 9+ messages in thread
From: Paul Brook @ 2006-01-06 13:27 UTC (permalink / raw)
To: Mark Williamson; +Cc: qemu-devel
> > IIUC PCI cards don't really have "DMA engines" as such. The PCI bridge
> > just maps PCI address space onto physical memory. A bus-master PCI device
> > can then make arbitrary accesses whenever it wants. I expect the default
> > mapping is a 1:1 mapping of the first 4G of physical RAM.
>
> I was under the impression that the on-card bus-mastering engines were
> often one of a few standard designs - this was suggested to me by someone
> more knowledgeable than myself, but I must admit I don't have any hard
> evidence that it's the case ;-)
What about 64-bit systems that use an IOMMU? Don't they already have a 64-bit
physical -> 32-bit IO address space mapping? I don't know if this mapping is
per-bus or system global.
Paul
* Re: [Qemu-devel] PCI access virtualization
2006-01-06 13:27 ` Paul Brook
@ 2006-01-06 14:23 ` Mark Williamson
0 siblings, 0 replies; 9+ messages in thread
From: Mark Williamson @ 2006-01-06 14:23 UTC (permalink / raw)
To: Paul Brook; +Cc: qemu-devel
> What about 64-bit systems that use an IOMMU? Don't they already have a
> 64-bit physical -> 32-bit IO address space mapping? I don't know if this
> mapping is per-bus or system global.
If it's an Intel x86_64 machine (no IOMMU yet), IIRC the 32-bit PCI devices
can only DMA into the bottom of memory - there's no remapping. For
this reason I've seen various discussions about a ZONE_DMA_32 so that 32-bit
devices can request memory in this area...
If there's an IOMMU then the addressable 32 bits could of course map anywhere
in the host memory... For something like the Opteron, I imagine the IOMMU
mappings happen in the on-CPU memory controller, so I'd guess there'd be (at
most) one mapping per processor package. I've not actually looked into the
specifics of this, so take it with a pinch of salt ;-)
There's also a SoftIOMMU driver under Linux that fakes out IOMMU behaviour by
using bounce buffers. Actually, thinking about it I agree with you - it
*ought* to be possible to implement an IOMMU driver that does (at least some
of) what you'd need for PCI passthrough... the Linux APIs do require the
kernel to be notified of the DMA-able memory the driver intends to use.
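For the record, that notification is the dma_map_single() / pci_map_single()
step every bus-mastering driver already does anyway - roughly like this,
sketch only, with "dev" and "buf" being whatever the driver has to hand:

  /* Sketch: the DMA mapping API is where the kernel (and hence an IOMMU,
   * or a passthrough layer sitting in the same place) learns which memory
   * a driver is about to hand to the card. */
  #include <linux/dma-mapping.h>

  static dma_addr_t map_tx_buffer(struct device *dev, void *buf, size_t len)
  {
      /* The returned bus address is what the card gets programmed with;
       * on an IOMMU system it need not equal the CPU physical address. */
      dma_addr_t bus_addr = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

      if (dma_mapping_error(dev, bus_addr))
          return 0;                        /* caller treats 0 as failure */
      return bus_addr;
  }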
Cheers,
Mark
* Re: [Qemu-devel] PCI access virtualization
2006-01-05 10:14 [Qemu-devel] PCI access virtualization Michael Renzmann
2006-01-05 14:25 ` Paul Brook
@ 2006-01-05 16:25 ` Mark Williamson
1 sibling, 0 replies; 9+ messages in thread
From: Mark Williamson @ 2006-01-05 16:25 UTC (permalink / raw)
To: qemu-devel, mrenzmann
> I'm wondering if it would be possible to modify QEMU such that VMs could
> access PCI devices on the host system. And, if it would be possible at
> all, how much work this would be.
Sounds doable but would require a fair bit of hacking - I think the idea of
doing this under QEmu came up once before in the 2004/2005 timeframe.
For the record, Intel also implemented emulation + PCI passthrough for their
SoftSDV project, but that also included extra hardware to allow the virtual
machine to control an entire PCI bus, not just one device. (The goal here
was to have a working IA64 "hardware" platform before IA64 silicon was
available.)
> I vaguely heard of a feature present in Xen which allows assigning PCI
> devices to one of the guests. I understand Xen works differently from
> QEMU, but maybe it would be possible to implement something similar.
Yep, Xen 2.0 supports this. Support is currently broken in Xen 3.0 (because
PCI is handled differently) but will be coming back once somebody decides
what approach we should use and implements it.
Cheers,
Mark