From: Alex Williamson <alex.williamson@redhat.com>
To: "Benoît Canet" <benoit.canet@irqsave.net>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Guest IOMMU and Cisco usnic
Date: Wed, 12 Feb 2014 12:34:25 -0700
Message-ID: <1392233665.15608.299.camel@ul30vt.home>
In-Reply-To: <20140212181027.GB4225@irqsave.net>

On Wed, 2014-02-12 at 19:10 +0100, Benoît Canet wrote:
> Hi Alex,
> 
> After the IRC conversation we had a few days ago I understood that guest IOMMU
> was not implemented.
> 
> I have a real use case for it:
> 
> Cisco usnic allows writing MPI applications that drive the network card from
> userspace in order to optimize latency. It's made for compute clusters.
> 
> The typical cloud provider doesn't provide bare-metal access, only VMs on top
> of Cisco's hardware. Hence VFIO uses the IOMMU to pass the NIC through to the
> guest, and no IOMMU is present in the guest.
> 
> Questions: Would writing a performant guest IOMMU implementation be possible?
>            How complex does this project look to someone who knows IOMMU issues?
> 
> The ideal implementation would forward the IOMMU work to the host hardware for
> speed.
> 
> I can devote time to writing the feature if it's doable.

Hi Benoît,

I imagine it's doable, but it's certainly not trivial.  Beyond that, I
haven't put much thought into it.

VFIO running in a guest would need an IOMMU that implements both the
IOMMU API and IOMMU groups.  Whether that comes from an emulated
physical IOMMU (like VT-d) or from a new paravirt IOMMU would be for you
to decide.  VT-d would imply using a PCIe chipset like Q35 and either
bolting VT-d onto it or updating Q35 to something that natively supports
VT-d.  A sufficiently similar PCIe hierarchy between host and guest
would also be required.

The current model of putting all guest devices in a single IOMMU domain
on the host is likely not what you would want and might imply a new VFIO
IOMMU backend that is better tuned for separate domains, sparse
mappings, and low latency.  VFIO has a modular IOMMU design, so this
isn't architecturally a problem.  The VFIO user (QEMU) is able to select
which backend to use and the code is written with supporting multiple
backends in mind.
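
As a rough illustration of that selection step (not part of the original
message): a VFIO user opens the container node and asks which IOMMU
backends it supports before picking one.  The sketch below assumes the
ioctl numbering from <linux/vfio.h> (type ';', base 100) and the _IO
encoding used on x86 and ARM; the helper names are hypothetical.

```python
import fcntl
import os

# Assumed from <linux/vfio.h>: VFIO_TYPE is ';' and VFIO_BASE is 100.
VFIO_TYPE = ord(";")
VFIO_BASE = 100


def _io(nr):
    """_IO(VFIO_TYPE, nr) on architectures where _IOC_NONE is 0 (x86, ARM)."""
    return (VFIO_TYPE << 8) | nr


VFIO_GET_API_VERSION = _io(VFIO_BASE + 0)
VFIO_CHECK_EXTENSION = _io(VFIO_BASE + 1)
VFIO_SET_IOMMU = _io(VFIO_BASE + 2)  # only usable once a group is attached

VFIO_TYPE1_IOMMU = 1  # the existing x86-style backend


def probe_backends(path="/dev/vfio/vfio", candidates=(VFIO_TYPE1_IOMMU,)):
    """Return the candidate backend IDs the container reports as supported,
    or None if the VFIO container cannot be opened or queried."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError:
        return None
    try:
        # The only API version defined so far is 0.
        if fcntl.ioctl(fd, VFIO_GET_API_VERSION) != 0:
            return None
        return [c for c in candidates
                if fcntl.ioctl(fd, VFIO_CHECK_EXTENSION, c) > 0]
    except OSError:
        return None
    finally:
        os.close(fd)
```

A new backend of the kind described above would presumably show up as
another extension ID to pass in `candidates`; which ID that would be is
not something this message defines.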

A complication you'll have is that the granularity of IOMMU operations
through VFIO is at the IOMMU group level, so the guest would not be able
to easily split devices grouped together on the host between separate
users in the guest.  That could be modeled as a conventional PCI bridge
masking the requester ID of devices in the guest such that host groups
are mirrored as guest groups.
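
To see which devices the host forces together, the group layout can be
read from sysfs.  This is a minimal sketch, assuming the usual
/sys/kernel/iommu_groups/<group>/devices/<address> layout; the function
names are hypothetical and not part of any VFIO tooling.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group ID to the sorted device addresses it contains."""
    groups = {}
    if not os.path.isdir(root):
        return groups
    for group in os.listdir(root):
        devdir = os.path.join(root, group, "devices")
        if os.path.isdir(devdir):
            groups[group] = sorted(os.listdir(devdir))
    return groups


def can_split(groups, dev_a, dev_b):
    """True if the two devices are in different groups, so they could be
    handed to separate users; raises KeyError for an unknown device."""
    owner = {dev: gid for gid, devs in groups.items() for dev in devs}
    return owner[dev_a] != owner[dev_b]
```

Two functions of one card that share a host group would make
`can_split` return False, which is the case where the guest cannot
hand them to separate users.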

There might also be simpler "punch-through" ways to do it.  For
instance, instead of trying to make it work like it does on the host,
we could invent a paravirt VFIO interface where a vfio-pv driver in the
guest populates /dev/vfio with slightly modified passthroughs to the
host fds.  The guest OS may not even really need to be aware of the
device.

It's an interesting project and certainly a valid use case.  I'd also
like to see things like Intel's DPDK move to using VFIO, but the current
UIO DPDK is often used in guests.  Thanks,

Alex
