qemu-devel.nongnu.org archive mirror
* [Qemu-devel] VFIO and scheduled SR-IOV cards
@ 2013-06-03 16:33 Benoît Canet
  2013-06-03 18:02 ` Alex Williamson
  0 siblings, 1 reply; 8+ messages in thread
From: Benoît Canet @ 2013-06-03 16:33 UTC (permalink / raw)
  To: iommu, qemu-devel, alex.williamson


Hello,

I plan to write a PF driver for an SR-IOV card and make the VFs work with QEMU's
VFIO passthrough so I am asking the following design question before trying to
write and push code.

After SR-IOV is enabled on this hardware, only one VF can be active
at a given time.

The PF host kernel driver is acting as a scheduler.
It switches every few milliseconds which VF is the currently active function while
disabling the other VFs.

One consequence of how the hardware works is that the MMR regions of the
switched-off VFs must be unmapped and their I/O accesses should block until the VF
is switched on again.

Each IOMMU map/unmap should be done in less than 100ns.
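
To make the constraint concrete, here is a minimal sketch (not existing code) of
what each switch would have to do with the in-kernel IOMMU API if the PF driver
could drive the domain directly; struct sched_vf and its mmr_* fields are made up
for illustration:

#include <linux/types.h>
#include <linux/iommu.h>

struct sched_vf {
        unsigned long mmr_iova;   /* IOVA at which the MMR BAR is mapped  */
        phys_addr_t   mmr_phys;   /* host physical address of the MMRs    */
        size_t        mmr_size;
};

/* Hypothetical per-switch remap, assuming direct access to the IOMMU
 * domain that VFIO owns (which is exactly what is missing today). */
static int switch_active_vf(struct iommu_domain *dom,
                            struct sched_vf *out, struct sched_vf *in)
{
        /* Tear down the outgoing VF's MMR window ... */
        iommu_unmap(dom, out->mmr_iova, out->mmr_size);

        /* ... and restore the incoming VF's MMR window.  Both calls
         * together have to fit in the ~100ns budget. */
        return iommu_map(dom, in->mmr_iova, in->mmr_phys,
                         in->mmr_size, IOMMU_READ | IOMMU_WRITE);
}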

As the kernel IOMMU module is called by the VFIO driver, the PF driver
cannot interface with it.

Currently the only interface of the VFIO code is for the userland QEMU process
and I fear that notifying QEMU that it should do the unmap/block would take more
than 100ns.

Also blocking the IO access in QEMU under the BQL would freeze QEMU.

Do you have an idea on how to write this required map and block/unmap feature?

Best regards

Benoît Canet


* Re: [Qemu-devel] VFIO and scheduled SR-IOV cards
  2013-06-03 16:33 [Qemu-devel] VFIO and scheduled SR-IOV cards Benoît Canet
@ 2013-06-03 18:02 ` Alex Williamson
  2013-06-03 18:34   ` Don Dutile
  0 siblings, 1 reply; 8+ messages in thread
From: Alex Williamson @ 2013-06-03 18:02 UTC (permalink / raw)
  To: Benoît Canet; +Cc: iommu, qemu-devel

On Mon, 2013-06-03 at 18:33 +0200, Benoît Canet wrote:
> Hello,
> 
> I plan to write a PF driver for an SR-IOV card and make the VFs work with QEMU's
> VFIO passthrough so I am asking the following design question before trying to
> write and push code.
> 
> After SR-IOV is enabled on this hardware, only one VF can be active
> at a given time.

Is this actually an SR-IOV device or are you trying to write a driver
that emulates SR-IOV for a PF?

> The PF host kernel driver is acting as a scheduler.
> It switches every few milliseconds which VF is the currently active function while
> disabling the other VFs.
> 
> One consequence of how the hardware works is that the MMR regions of the
> switched-off VFs must be unmapped and their I/O accesses should block until the VF
> is switched on again.

MMR = Memory Mapped Register?

This seems contradictory to the SR-IOV spec, which states:

        Each VF contains a non-shared set of physical resources required
        to deliver Function-specific
        services, e.g., resources such as work queues, data buffers,
        etc. These resources can be directly
        accessed by an SI without requiring VI or SR-PCIM intervention.

Furthermore, each VF should have a separate requester ID.  What's being
suggested here seems like maybe that's not the case.  If true, it would
make iommu groups challenging.  Is there any VF save/restore around the
scheduling?

> Each IOMMU map/unmap should be done in less than 100ns.

I think that may be a lot to ask if we need to unmap the regions in the
guest and in the iommu.  If the "VFs" used different requester IDs,
iommu unmapping wouldn't be necessary.  I experimented with switching
between trapped (read/write) access to memory regions and mmap'd (direct
mapping) for handling legacy interrupts.  There was a noticeable
performance penalty switching per interrupt.

> As the kernel IOMMU module is called by the VFIO driver, the PF driver
> cannot interface with it.
> 
> Currently the only interface of the VFIO code is for the userland QEMU process
> and I fear that notifying QEMU that it should do the unmap/block would take more
> than 100ns.
> 
> Also blocking the IO access in QEMU under the BQL would freeze QEMU.
> 
> Do you have an idea on how to write this required map and block/unmap feature?

It seems like there are several options, but I'm doubtful that any of
them will meet 100ns.  If this is completely fake SR-IOV and there's not
a different requester ID per VF, I'd start with seeing if you can even
do the iommu_unmap/iommu_map of the MMIO BARs in under 100ns.  If that's
close to your limit, then your only real option for QEMU is to freeze
it, which still involves getting multiple (maybe many) vCPUs out of VM
mode.  That's not free either.  If by some miracle you have time to
spare, you could remap the regions to trapped mode and let the vCPUs run
while vfio blocks on read/write.
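
To answer the first question above (whether a bare iommu_unmap/iommu_map cycle
fits in 100ns), something along these lines would tell you quickly.  This is only
an untested sketch; "dom", "iova", "paddr" and "size" stand in for whatever your
driver actually uses:

#include <linux/iommu.h>
#include <linux/ktime.h>
#include <linux/printk.h>

/* Rough timing of one unmap/map cycle for an MMIO BAR. */
static void time_mmio_remap(struct iommu_domain *dom, unsigned long iova,
                            phys_addr_t paddr, size_t size)
{
        ktime_t t0 = ktime_get();

        iommu_unmap(dom, iova, size);
        iommu_map(dom, iova, paddr, size, IOMMU_READ | IOMMU_WRITE);

        pr_info("MMIO remap took %lld ns\n",
                ktime_to_ns(ktime_sub(ktime_get(), t0)));
}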

Maybe there's even a question whether mmap'd mode is worthwhile for this
device.  Trapping every read/write is orders of magnitude slower, but
allows you to handle the "wait for VF" on the kernel side.
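
Roughly, the trapped path could look like the sketch below; struct sched_vf and
its "active", "wq" and "mmr_base" fields are invented for illustration rather
than existing vfio structures, but the idea is that the region read handler just
sleeps until your scheduler marks the VF active again:

#include <linux/types.h>
#include <linux/errno.h>
#include <linux/wait.h>
#include <linux/io.h>
#include <linux/uaccess.h>

struct sched_vf {
        wait_queue_head_t wq;      /* woken by the PF scheduler            */
        bool active;               /* true while this VF owns the hardware */
        void __iomem *mmr_base;    /* ioremap'd MMR window of this VF      */
};

/* Sketch of a trapped MMR read that blocks while the VF is scheduled out. */
static ssize_t vf_mmr_read(struct sched_vf *vf, char __user *buf,
                           size_t count, loff_t *ppos)
{
        u32 val;

        if (count < sizeof(val))
                return -EINVAL;

        /* Sleep until the PF scheduler switches this VF back in. */
        if (wait_event_interruptible(vf->wq, vf->active))
                return -ERESTARTSYS;

        val = readl(vf->mmr_base + *ppos);
        if (copy_to_user(buf, &val, sizeof(val)))
                return -EFAULT;

        *ppos += sizeof(val);
        return sizeof(val);
}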

If you can provide more info on the device design/constraints, maybe we
can come up with better options.  Thanks,

Alex


* Re: [Qemu-devel] VFIO and scheduled SR-IOV cards
  2013-06-03 18:02 ` Alex Williamson
@ 2013-06-03 18:34   ` Don Dutile
  2013-06-03 18:57     ` Alex Williamson
  0 siblings, 1 reply; 8+ messages in thread
From: Don Dutile @ 2013-06-03 18:34 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Benoît Canet, iommu, qemu-devel

On 06/03/2013 02:02 PM, Alex Williamson wrote:
> On Mon, 2013-06-03 at 18:33 +0200, Benoît Canet wrote:
>> Hello,
>>
>> I plan to write a PF driver for an SR-IOV card and make the VFs work with QEMU's
>> VFIO passthrough so I am asking the following design question before trying to
>> write and push code.
>>
>> After SR-IOV is enabled on this hardware, only one VF can be active
>> at a given time.
>
> Is this actually an SR-IOV device or are you trying to write a driver
> that emulates SR-IOV for a PF?
>
>> The PF host kernel driver is acting as a scheduler.
>> It switches every few milliseconds which VF is the currently active function while
>> disabling the other VFs.
>>
that's time-sharing of hw, which sw doesn't see ... so, ok.

>> One consequence of how the hardware works is that the MMR regions of the
>> switched-off VFs must be unmapped and their I/O accesses should block until the VF
>> is switched on again.
>
This violates the spec., and does impact sw -- how can one assign such a VF to a guest
-- it does not work indep. of other VFs.

> MMR = Memory Mapped Register?
>
> This seems contradictory to the SR-IOV spec, which states:
>
>          Each VF contains a non-shared set of physical resources required
>          to deliver Function-specific
>          services, e.g., resources such as work queues, data buffers,
>          etc. These resources can be directly
>          accessed by an SI without requiring VI or SR-PCIM intervention.
>
> Furthermore, each VF should have a separate requester ID.  What's being
> suggested here seems like maybe that's not the case.  If true, it would
I didn't read it that way above.  I read it as the PCIe end is timeshared
btwn VFs (& PFs?) ... with some VFs disappearing (from a driver perspective)
as if the device was hot-unplugged w/o notification.  That will probably cause
read timeouts & SMEs, bringing down most enterprise-level systems.

> make iommu groups challenging.  Is there any VF save/restore around the
> scheduling?
>
>> Each IOMMU map/unmap should be done in less than 100ns.
>
> I think that may be a lot to ask if we need to unmap the regions in the
> guest and in the iommu.  If the "VFs" used different requester IDs,
> iommu unmapping wouldn't be necessary.  I experimented with switching
> between trapped (read/write) access to memory regions and mmap'd (direct
> mapping) for handling legacy interrupts.  There was a noticeable
> performance penalty switching per interrupt.
>
>> As the kernel IOMMU module is called by the VFIO driver, the PF driver
>> cannot interface with it.
>>
>> Currently the only interface of the VFIO code is for the userland QEMU process
>> and I fear that notifying QEMU that it should do the unmap/block would take more
>> than 100ns.
>>
>> Also blocking the IO access in QEMU under the BQL would freeze QEMU.
>>
>> Do you have an idea on how to write this required map and block/unmap feature?
>
> It seems like there are several options, but I'm doubtful that any of
> them will meet 100ns.  If this is completely fake SR-IOV and there's not
> a different requester ID per VF, I'd start with seeing if you can even
> do the iommu_unmap/iommu_map of the MMIO BARs in under 100ns.  If that's
> close to your limit, then your only real option for QEMU is to freeze
> it, which still involves getting multiple (maybe many) vCPUs out of VM
> mode.  That's not free either.  If by some miracle you have time to
> spare, you could remap the regions to trapped mode and let the vCPUs run
> while vfio blocks on read/write.
>
> Maybe there's even a question whether mmap'd mode is worthwhile for this
> device.  Trapping every read/write is orders of magnitude slower, but
> allows you to handle the "wait for VF" on the kernel side.
>
> If you can provide more info on the device design/constraints, maybe we
> can come up with better options.  Thanks,
>
> Alex
>


* Re: [Qemu-devel] VFIO and scheduled SR-IOV cards
  2013-06-03 18:34   ` Don Dutile
@ 2013-06-03 18:57     ` Alex Williamson
  2013-06-04 15:50       ` Benoît Canet
  0 siblings, 1 reply; 8+ messages in thread
From: Alex Williamson @ 2013-06-03 18:57 UTC (permalink / raw)
  To: Don Dutile; +Cc: Benoît Canet, iommu, qemu-devel

On Mon, 2013-06-03 at 14:34 -0400, Don Dutile wrote:
> On 06/03/2013 02:02 PM, Alex Williamson wrote:
> > On Mon, 2013-06-03 at 18:33 +0200, Benoît Canet wrote:
> >> Hello,
> >>
> >> I plan to write a PF driver for an SR-IOV card and make the VFs work with QEMU's
> >> VFIO passthrough so I am asking the following design question before trying to
> >> write and push code.
> >>
> >> After SR-IOV is enabled on this hardware, only one VF can be active
> >> at a given time.
> >
> > Is this actually an SR-IOV device or are you trying to write a driver
> > that emulates SR-IOV for a PF?
> >
> >> The PF host kernel driver is acting as a scheduler.
> >> It switches every few milliseconds which VF is the currently active function while
> >> disabling the other VFs.
> >>
> that's time-sharing of hw, which sw doesn't see ... so, ok.
> 
> >> One consequence of how the hardware works is that the MMR regions of the
> >> switched-off VFs must be unmapped and their I/O accesses should block until the VF
> >> is switched on again.
> >
> This violates the spec., and does impact sw -- how can one assign such a VF to a guest
> -- it does not work indep. of other VFs.
> 
> > MMR = Memory Mapped Register?
> >
> > This seems contradictory to the SR-IOV spec, which states:
> >
> >          Each VF contains a non-shared set of physical resources required
> >          to deliver Function-specific
> >          services, e.g., resources such as work queues, data buffers,
> >          etc. These resources can be directly
> >          accessed by an SI without requiring VI or SR-PCIM intervention.
> >
> > Furthermore, each VF should have a separate requester ID.  What's being
> > suggested here seems like maybe that's not the case.  If true, it would
> I didn't read it that way above.  I read it as the PCIe end is timeshared
> btwn VFs (& PFs?) ... with some VFs disappearing (from a driver perspective)
> as if the device was hot-unplugged w/o notification.  That will probably cause
> read timeouts & SMEs, bringing down most enterprise-level systems.

Perhaps I'm reading too much into it, but using the same requester ID
would seem like justification for why the device needs to be unmapped.
Otherwise we could just stop QEMU and leave the mappings alone if we
just want to make sure access to the device is blocked while the device
is swapped out.  Not the best overall throughput algorithm, but maybe a
proof of concept.  Need more info about how the device actually behaves
to know for sure.  Thanks,

Alex

> > make iommu groups challenging.  Is there any VF save/restore around the
> > scheduling?
> >
> >> Each IOMMU map/unmap should be done in less than 100ns.
> >
> > I think that may be a lot to ask if we need to unmap the regions in the
> > guest and in the iommu.  If the "VFs" used different requester IDs,
> > iommu unmapping wouldn't be necessary.  I experimented with switching
> > between trapped (read/write) access to memory regions and mmap'd (direct
> > mapping) for handling legacy interrupts.  There was a noticeable
> > performance penalty switching per interrupt.
> >
> >> As the kernel IOMMU module is called by the VFIO driver, the PF driver
> >> cannot interface with it.
> >>
> >> Currently the only interface of the VFIO code is for the userland QEMU process
> >> and I fear that notifying QEMU that it should do the unmap/block would take more
> >> than 100ns.
> >>
> >> Also blocking the IO access in QEMU under the BQL would freeze QEMU.
> >>
> >> Do you have an idea on how to write this required map and block/unmap feature?
> >
> > It seems like there are several options, but I'm doubtful that any of
> > them will meet 100ns.  If this is completely fake SR-IOV and there's not
> > a different requester ID per VF, I'd start with seeing if you can even
> > do the iommu_unmap/iommu_map of the MMIO BARs in under 100ns.  If that's
> > close to your limit, then your only real option for QEMU is to freeze
> > it, which still involves getting multiple (maybe many) vCPUs out of VM
> > mode.  That's not free either.  If by some miracle you have time to
> > spare, you could remap the regions to trapped mode and let the vCPUs run
> > while vfio blocks on read/write.
> >
> > Maybe there's even a question whether mmap'd mode is worthwhile for this
> > device.  Trapping every read/write is orders of magnitude slower, but
> > allows you to handle the "wait for VF" on the kernel side.
> >
> > If you can provide more info on the device design/constraints, maybe we
> > can come up with better options.  Thanks,
> >
> > Alex
> >
> 


* Re: [Qemu-devel] VFIO and scheduled SR-IOV cards
  2013-06-03 18:57     ` Alex Williamson
@ 2013-06-04 15:50       ` Benoît Canet
  2013-06-04 18:31         ` Alex Williamson
  2013-07-10 10:23         ` Michael S. Tsirkin
  0 siblings, 2 replies; 8+ messages in thread
From: Benoît Canet @ 2013-06-04 15:50 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Benoît Canet, iommu, Don Dutile, qemu-devel


Hello,

More information on how the hardware works.

-Each VF will have its own memory and MMR, etc.
That means the resources are not shared.

-Each VF will have its own bus number, function number and device number.
That means the requester ID is separate for each VF.

There is also a VF save/restore area for the switch.

A VF's regular memory (not MMR) is still accessible after a switch out.

But when VF number 1 is scheduled, a read of an MMR of VF number 0 could return
the value of the same MMR in VF number 1, because VF number 1 is switched on and
the PF processor is busy servicing VF number 1.

This could confuse the guest VF driver, so the unmap-and-block (or another
technique achieving the same goal) is required.

I hope this information narrows down the problem to solve.

Best regards

Benoît Canet

> On Monday 03 Jun 2013 at 12:57:45 (-0600), Alex Williamson wrote:
> On Mon, 2013-06-03 at 14:34 -0400, Don Dutile wrote:
> > On 06/03/2013 02:02 PM, Alex Williamson wrote:
> > > On Mon, 2013-06-03 at 18:33 +0200, Benoît Canet wrote:
> > >> Hello,
> > >>
> > >> I plan to write a PF driver for an SR-IOV card and make the VFs work with QEMU's
> > >> VFIO passthrough so I am asking the following design question before trying to
> > >> write and push code.
> > >>
> > >> After SR-IOV is enabled on this hardware, only one VF can be active
> > >> at a given time.
> > >
> > > Is this actually an SR-IOV device or are you trying to write a driver
> > > that emulates SR-IOV for a PF?
> > >
> > >> The PF host kernel driver is acting as a scheduler.
> > >> It switches every few milliseconds which VF is the currently active function while
> > >> disabling the other VFs.
> > >>
> > that's time-sharing of hw, which sw doesn't see ... so, ok.
> > 
> > >> One consequence of how the hardware works is that the MMR regions of the
> > >> switched-off VFs must be unmapped and their I/O accesses should block until the VF
> > >> is switched on again.
> > >
> > This violates the spec., and does impact sw -- how can one assign such a VF to a guest
> > -- it does not work indep. of other VFs.
> > 
> > > MMR = Memory Mapped Register?
> > >
> > > This seems contradictory to the SR-IOV spec, which states:
> > >
> > >          Each VF contains a non-shared set of physical resources required
> > >          to deliver Function-specific
> > >          services, e.g., resources such as work queues, data buffers,
> > >          etc. These resources can be directly
> > >          accessed by an SI without requiring VI or SR-PCIM intervention.
> > >
> > > Furthermore, each VF should have a separate requester ID.  What's being
> > > suggested here seems like maybe that's not the case.  If true, it would
> > I didn't read it that way above.  I read it as the PCIe end is timeshared
> > btwn VFs (& PFs?) ... with some VFs disappearing (from a driver perspective)
> > as if the device was hot-unplugged w/o notification.  That will probably cause
> > read timeouts & SMEs, bringing down most enterprise-level systems.
> 
> Perhaps I'm reading too much into it, but using the same requester ID
> would seem like justification for why the device needs to be unmapped.
> Otherwise we could just stop QEMU and leave the mappings alone if we
> just want to make sure access to the device is blocked while the device
> is swapped out.  Not the best overall throughput algorithm, but maybe a
> proof of concept.  Need more info about how the device actually behaves
> to know for sure.  Thanks,
> 
> Alex
> 
> > > make iommu groups challenging.  Is there any VF save/restore around the
> > > scheduling?
> > >
> > >> Each IOMMU map/unmap should be done in less than 100ns.
> > >
> > > I think that may be a lot to ask if we need to unmap the regions in the
> > > guest and in the iommu.  If the "VFs" used different requester IDs,
> > > iommu unmapping wouldn't be necessary.  I experimented with switching
> > > between trapped (read/write) access to memory regions and mmap'd (direct
> > > mapping) for handling legacy interrupts.  There was a noticeable
> > > performance penalty switching per interrupt.
> > >
> > >> As the kernel IOMMU module is called by the VFIO driver, the PF driver
> > >> cannot interface with it.
> > >>
> > >> Currently the only interface of the VFIO code is for the userland QEMU process
> > >> and I fear that notifying QEMU that it should do the unmap/block would take more
> > >> than 100ns.
> > >>
> > >> Also blocking the IO access in QEMU under the BQL would freeze QEMU.
> > >>
> > >> Do you have an idea on how to write this required map and block/unmap feature?
> > >
> > > It seems like there are several options, but I'm doubtful that any of
> > > them will meet 100ns.  If this is completely fake SR-IOV and there's not
> > > a different requester ID per VF, I'd start with seeing if you can even
> > > do the iommu_unmap/iommu_map of the MMIO BARs in under 100ns.  If that's
> > > close to your limit, then your only real option for QEMU is to freeze
> > > it, which still involves getting multiple (maybe many) vCPUs out of VM
> > > mode.  That's not free either.  If by some miracle you have time to
> > > spare, you could remap the regions to trapped mode and let the vCPUs run
> > > while vfio blocks on read/write.
> > >
> > > Maybe there's even a question whether mmap'd mode is worthwhile for this
> > > device.  Trapping every read/write is orders of magnitude slower, but
> > > allows you to handle the "wait for VF" on the kernel side.
> > >
> > > If you can provide more info on the device design/constraints, maybe we
> > > can come up with better options.  Thanks,
> > >
> > > Alex
> > >


* Re: [Qemu-devel] VFIO and scheduled SR-IOV cards
  2013-06-04 15:50       ` Benoît Canet
@ 2013-06-04 18:31         ` Alex Williamson
  2013-07-10 10:23         ` Michael S. Tsirkin
  1 sibling, 0 replies; 8+ messages in thread
From: Alex Williamson @ 2013-06-04 18:31 UTC (permalink / raw)
  To: Benoît Canet; +Cc: iommu, Don Dutile, qemu-devel

On Tue, 2013-06-04 at 17:50 +0200, Benoît Canet wrote:
> Hello,
> 
> More information on how the hardware works.
> 
> -Each VF will have its own memory and MMR, etc.
> That means the resources are not shared.

I'm still not clear on MMR, what is that?  Memory Mapped Registers (ie.
registers accessed through the device's MMIO regions)?

> -Each VF will have its own bus number, function number and device number.
> That means the requester ID is separate for each VF.

That's a relief :)

> There is also a VF save/restore area for the switch.
> 
> A VF's regular memory (not MMR) is still accessible after a switch out.

Does this mean that the MMIO of the device has some sections that are
memory mapped registers and some sections that are regular memory and
just the sections that are memory mapped registers are not accessible
when the VF is swapped out?  Are these within the same PCI BAR or split
across BARs?  What are the performance requirements of access to the MMR
regions (ie. do they even need to be mmap'd for direct access)?

> But when VF number 1 is scheduled, a read of an MMR of VF number 0 could return
> the value of the same MMR in VF number 1, because VF number 1 is switched on and
> the PF processor is busy servicing VF number 1.
> 
> This could confuse the guest VF driver, so the unmap-and-block (or another
> technique achieving the same goal) is required.
> 
> I hope this information narrows down the problem to solve.

Getting clearer.  Thanks,

Alex

> > On Monday 03 Jun 2013 at 12:57:45 (-0600), Alex Williamson wrote:
> > On Mon, 2013-06-03 at 14:34 -0400, Don Dutile wrote:
> > > On 06/03/2013 02:02 PM, Alex Williamson wrote:
> > > > On Mon, 2013-06-03 at 18:33 +0200, Benoît Canet wrote:
> > > >> Hello,
> > > >>
> > > >> I plan to write a PF driver for an SR-IOV card and make the VFs work with QEMU's
> > > >> VFIO passthrough so I am asking the following design question before trying to
> > > >> write and push code.
> > > >>
> > > >> After SR-IOV is enabled on this hardware, only one VF can be active
> > > >> at a given time.
> > > >
> > > > Is this actually an SR-IOV device or are you trying to write a driver
> > > > that emulates SR-IOV for a PF?
> > > >
> > > >> The PF host kernel driver is acting as a scheduler.
> > > >> It switches every few milliseconds which VF is the currently active function while
> > > >> disabling the other VFs.
> > > >>
> > > that's time-sharing of hw, which sw doesn't see ... so, ok.
> > > 
> > > >> One consequence of how the hardware works is that the MMR regions of the
> > > >> switched-off VFs must be unmapped and their I/O accesses should block until the VF
> > > >> is switched on again.
> > > >
> > > This violates the spec., and does impact sw -- how can one assign such a VF to a guest
> > > -- it does not work indep. of other VFs.
> > > 
> > > > MMR = Memory Mapped Register?
> > > >
> > > > This seems contradictory to the SR-IOV spec, which states:
> > > >
> > > >          Each VF contains a non-shared set of physical resources required
> > > >          to deliver Function-specific
> > > >          services, e.g., resources such as work queues, data buffers,
> > > >          etc. These resources can be directly
> > > >          accessed by an SI without requiring VI or SR-PCIM intervention.
> > > >
> > > > Furthermore, each VF should have a separate requester ID.  What's being
> > > > suggested here seems like maybe that's not the case.  If true, it would
> > > I didn't read it that way above.  I read it as the PCIe end is timeshared
> > > btwn VFs (& PFs?) ... with some VFs disappearing (from a driver perspective)
> > > as if the device was hot-unplugged w/o notification.  That will probably cause
> > > read timeouts & SMEs, bringing down most enterprise-level systems.
> > 
> > Perhaps I'm reading too much into it, but using the same requester ID
> > would seem like justification for why the device needs to be unmapped.
> > Otherwise we could just stop QEMU and leave the mappings alone if we
> > just want to make sure access to the device is blocked while the device
> > is swapped out.  Not the best overall throughput algorithm, but maybe a
> > proof of concept.  Need more info about how the device actually behaves
> > to know for sure.  Thanks,
> > 
> > Alex
> > 
> > > > make iommu groups challenging.  Is there any VF save/restore around the
> > > > scheduling?
> > > >
> > > >> Each IOMMU map/unmap should be done in less than 100ns.
> > > >
> > > > I think that may be a lot to ask if we need to unmap the regions in the
> > > > guest and in the iommu.  If the "VFs" used different requester IDs,
> > > > iommu unmapping wouldn't be necessary.  I experimented with switching
> > > > between trapped (read/write) access to memory regions and mmap'd (direct
> > > > mapping) for handling legacy interrupts.  There was a noticeable
> > > > performance penalty switching per interrupt.
> > > >
> > > >> As the kernel IOMMU module is called by the VFIO driver, the PF driver
> > > >> cannot interface with it.
> > > >>
> > > >> Currently the only interface of the VFIO code is for the userland QEMU process
> > > >> and I fear that notifying QEMU that it should do the unmap/block would take more
> > > >> than 100ns.
> > > >>
> > > >> Also blocking the IO access in QEMU under the BQL would freeze QEMU.
> > > >>
> > > >> Do you have an idea on how to write this required map and block/unmap feature?
> > > >
> > > > It seems like there are several options, but I'm doubtful that any of
> > > > them will meet 100ns.  If this is completely fake SR-IOV and there's not
> > > > a different requester ID per VF, I'd start with seeing if you can even
> > > > do the iommu_unmap/iommu_map of the MMIO BARs in under 100ns.  If that's
> > > > close to your limit, then your only real option for QEMU is to freeze
> > > > it, which still involves getting multiple (maybe many) vCPUs out of VM
> > > > mode.  That's not free either.  If by some miracle you have time to
> > > > spare, you could remap the regions to trapped mode and let the vCPUs run
> > > > while vfio blocks on read/write.
> > > >
> > > > Maybe there's even a question whether mmap'd mode is worthwhile for this
> > > > device.  Trapping every read/write is orders of magnitude slower, but
> > > > allows you to handle the "wait for VF" on the kernel side.
> > > >
> > > > If you can provide more info on the device design/constraints, maybe we
> > > > can come up with better options.  Thanks,
> > > >
> > > > Alex
> > > >


* Re: [Qemu-devel] VFIO and scheduled SR-IOV cards
  2013-06-04 15:50       ` Benoît Canet
  2013-06-04 18:31         ` Alex Williamson
@ 2013-07-10 10:23         ` Michael S. Tsirkin
  2013-07-28 15:17           ` Benoît Canet
  1 sibling, 1 reply; 8+ messages in thread
From: Michael S. Tsirkin @ 2013-07-10 10:23 UTC (permalink / raw)
  To: Benoît Canet; +Cc: Alex Williamson, Don Dutile, iommu, qemu-devel

On Tue, Jun 04, 2013 at 05:50:30PM +0200, Benoît Canet wrote:
> 
> Hello,
> 
> More information on how the hardware works.
> 
> -Each VF will have its own memory and MMR, etc.
> That means the resources are not shared.
> 
> -Each VF will have its own bus number, function number and device number.
> That means the requester ID is separate for each VF.
> 
> There is also a VF save/restore area for the switch.
> 
> A VF's regular memory (not MMR) is still accessible after a switch out.
> 
> But when VF number 1 is scheduled, a read of an MMR of VF number 0 could return
> the value of the same MMR in VF number 1, because VF number 1 is switched on and
> the PF processor is busy servicing VF number 1.
> 
> This could confuse the guest VF driver, so the unmap-and-block (or another
> technique achieving the same goal) is required.
> 
> I hope this information narrows down the problem to solve.
> 
> Best regards
> 
> Benoît Canet

Confused.
You have one VF accessing BAR of another VF?
Why?

-- 
MST


* Re: [Qemu-devel] VFIO and scheduled SR-IOV cards
  2013-07-10 10:23         ` Michael S. Tsirkin
@ 2013-07-28 15:17           ` Benoît Canet
  0 siblings, 0 replies; 8+ messages in thread
From: Benoît Canet @ 2013-07-28 15:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Benoît Canet, Alex Williamson, Don Dutile, iommu, qemu-devel

> Confused.
> You have one VF accessing BAR of another VF?
> Why?

The VFs are scheduled and the board has only one microcontroller responsible
for giving the read results for some memory mapped registers.
So it could respond with the value of the active VF when a BAR of an inactive VF
is read.

I handed over the contract as it seemed too tricky.

Best regards

Benoît


Thread overview: 8+ messages
2013-06-03 16:33 [Qemu-devel] VFIO and scheduled SR-IOV cards Benoît Canet
2013-06-03 18:02 ` Alex Williamson
2013-06-03 18:34   ` Don Dutile
2013-06-03 18:57     ` Alex Williamson
2013-06-04 15:50       ` Benoît Canet
2013-06-04 18:31         ` Alex Williamson
2013-07-10 10:23         ` Michael S. Tsirkin
2013-07-28 15:17           ` Benoît Canet
