From: Alex Williamson <alex.williamson@redhat.com>
To: "Benoît Canet" <benoit.canet@irqsave.net>
Cc: iommu@lists.linux-foundation.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] VFIO and scheduled SR-IOV cards
Date: Mon, 03 Jun 2013 12:02:09 -0600
Message-ID: <1370282529.30975.344.camel@ul30vt.home>
In-Reply-To: <20130603163305.GC4094@irqsave.net>

On Mon, 2013-06-03 at 18:33 +0200, Benoît Canet wrote:
> Hello,
> 
> I plan to write a PF driver for an SR-IOV card and make the VFs work with QEMU's
> VFIO passthrough, so I am asking the following design question before trying to
> write and push code.
> 
> After SR-IOV is enabled on this hardware, only one VF can be active at a given
> time.

Is this actually an SR-IOV device or are you trying to write a driver
that emulates SR-IOV for a PF?

> The PF host kernel driver acts as a scheduler.
> It switches every few milliseconds which VF is the currently active function
> while disabling the other VFs.
> 
> One consequence of how the hardware works is that the MMR regions of the
> switched-off VFs must be unmapped and their I/O accesses should block until the
> VF is switched on again.

MMR = Memory Mapped Register?

This seems contradictory to the SR-IOV spec, which states:

        Each VF contains a non-shared set of physical resources required
        to deliver Function-specific services, e.g., resources such as
        work queues, data buffers, etc. These resources can be directly
        accessed by an SI without requiring VI or SR-PCIM intervention.

Furthermore, each VF should have a separate requester ID.  What's being
suggested here makes it sound like that may not be the case.  If true, it
would make iommu groups challenging.  Is there any VF save/restore around
the scheduling?
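
For illustration only (hypothetical names, not code from any existing
driver), a PF driver could sanity-check this after enabling SR-IOV by
looking at which iommu group each VF lands in; if the "VFs" share the
PF's requester ID they'll typically share its group, which rules out
handing them to separate containers/guests:

#include <linux/pci.h>
#include <linux/iommu.h>

/* Hypothetical sketch: report the iommu group of every VF of 'pf'. */
static void check_vf_groups(struct pci_dev *pf)
{
	struct pci_dev *vf = NULL;

	for_each_pci_dev(vf) {
		struct iommu_group *group;

		if (!vf->is_virtfn || pci_physfn(vf) != pf)
			continue;

		group = iommu_group_get(&vf->dev);
		if (!group) {
			dev_warn(&vf->dev, "no iommu group\n");
			continue;
		}
		dev_info(&vf->dev, "iommu group %d\n", iommu_group_id(group));
		iommu_group_put(group);
	}
}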

> Each IOMMU map/unmap should be done in less than 100ns.

I think that may be a lot to ask if we need to unmap the regions in the
guest and in the iommu.  If the "VFs" used different requester IDs,
iommu unmapping wouldn't be necessary.  I experimented with switching
between trapped (read/write) access to memory regions and mmap'd (direct
mapping) access for handling legacy interrupts.  There was a noticeable
performance penalty when switching per interrupt.
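
As a rough sanity check of that 100ns budget (sketch only; 'domain',
'iova', 'bar_phys' and 'bar_size' are assumed to already describe an
established mapping of the VF's MMIO BAR), something like this on the
host side would tell you what one unmap/map cycle actually costs with
the Linux IOMMU API:

#include <linux/iommu.h>
#include <linux/ktime.h>

/* Hypothetical helper: time one iommu_unmap()/iommu_map() cycle in ns. */
static s64 time_remap_cycle(struct iommu_domain *domain,
			    unsigned long iova, phys_addr_t bar_phys,
			    size_t bar_size)
{
	ktime_t start = ktime_get();
	int ret;

	iommu_unmap(domain, iova, bar_size);
	ret = iommu_map(domain, iova, bar_phys, bar_size,
			IOMMU_READ | IOMMU_WRITE);
	if (ret)
		return ret;

	return ktime_to_ns(ktime_sub(ktime_get(), start));
}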

> As the kernel iommu module is called by the VFIO driver, the PF driver cannot
> interface with it.
> 
> Currently the only interface of the VFIO code is for the userland QEMU process,
> and I fear that notifying QEMU that it should do the unmap/block would take
> more than 100ns.
> 
> Also, blocking the I/O access in QEMU under the BQL would freeze QEMU.
> 
> Do you have an idea of how to write this required map and block/unmap feature?

It seems like there are several options, but I'm doubtful that any of
them will meet 100ns.  If this is completely fake SR-IOV and there's not
a different requester ID per VF, I'd start with seeing if you can even
do the iommu_unmap/iommu_map of the MMIO BARs in under 100ns.  If that's
close to your limit, then your only real option for QEMU is to freeze
it, which still involves getting multiple (maybe many) vCPUs out of VM
mode.  That's not free either.  If by some miracle you have time to
spare, you could remap the regions to trapped mode and let the vCPUs run
while vfio blocks on read/write.
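
As a sketch of what I mean on the QEMU side (an assumption about how it
could be wired up, not existing code), if the mmap'd sub-region is
layered over the trapped I/O region, switching between the two is just a
matter of toggling the fast path:

#include "exec/memory.h"

/* Hypothetical: disabling the mmap'd sub-region forces guest accesses
 * through the underlying trapped MemoryRegion, where vfio can block. */
static void vf_set_fast_path(MemoryRegion *mmap_mr, bool active)
{
    memory_region_set_enabled(mmap_mr, active);
}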

Maybe there's even a question whether mmap'd mode is worthwhile for this
device.  Trapping every read/write is orders of magnitude slower, but
allows you to handle the "wait for VF" on the kernel side.
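
Something along these lines on the kernel side (purely hypothetical; it
assumes your PF driver keeps a per-VF 'active' flag and waitqueue) is
what I mean by handling the wait in the trapped path:

#include <linux/wait.h>
#include <linux/io.h>
#include <linux/uaccess.h>

/* Hypothetical per-VF state maintained by the scheduling PF driver. */
struct my_vf {
	void __iomem *mmio;		/* ioremap'd VF BAR */
	bool active;			/* set/cleared by the PF scheduler */
	wait_queue_head_t wait;		/* woken when the VF is switched on */
};

/* With mmap disabled, every guest access funnels through a trapped
 * handler like this, which simply sleeps until the VF is scheduled in. */
static ssize_t vf_region_read(struct my_vf *vf, char __user *buf,
			      size_t count, loff_t *ppos)
{
	u32 val;

	if (count < sizeof(val))
		return -EINVAL;

	if (wait_event_interruptible(vf->wait, vf->active))
		return -ERESTARTSYS;

	val = ioread32(vf->mmio + *ppos);
	if (copy_to_user(buf, &val, sizeof(val)))
		return -EFAULT;

	*ppos += sizeof(val);
	return sizeof(val);
}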

If you can provide more info on the device design/constraints, maybe we
can come up with better options.  Thanks,

Alex
