* FreeBSD guest with VTD NIC not passing traffic
@ 2011-12-19 14:19 Shashidhar Patil
2012-01-04 3:21 ` Alex Williamson
0 siblings, 1 reply; 11+ messages in thread
From: Shashidhar Patil @ 2011-12-19 14:19 UTC (permalink / raw)
To: kvm
Hi,
I am running Ubuntu 10.10 (amd64) on a 2-socket Nehalem-based
server with an IOH 5520. The 5520 supports VT-d.
I enabled DMAR with intel_iommu=on. The box has an Intel 82599 adapter,
which I assigned through VT-d to a FreeBSD 8.2 guest. The ixgbe driver
detects the device and configures it successfully, but the link never
comes up. It looks like the link up/down interrupts are not delivered.
I then checked the KVM interrupt assignment and, as expected, KVM could
not set up MSI-X entries for the VT-d guest, so there is no output from
"grep kvm /proc/interrupts". By enabling some debug output in qemu-kvm
I figured out that the MSI-X updates are not received properly. Linux
appears to update the MSI-X table in a batch, which qemu-kvm gets in
one shot, and everything works fine with Linux. With FreeBSD the PCIe
updates arrive one MSI-X entry at a time, which qemu-kvm cannot make
use of.
To work around this I compiled the latest Intel ixgbe driver with MSI-X
disabled. This time an MSI interrupt got allocated both in the guest
and in qemu-kvm (host kernel), but I still could not get link up. I
modified the FreeBSD driver to poll for link in a local_timer task.
The link then comes up, but no traffic flows. The MAC statistics show
packets received, but the packets do not reach the guest. DMA of the
packets may be failing; I could not find the reason. The same happens
with a legacy interrupt allocated in the guest. I even tried qemu-kvm
with prefer_msi=off.
I think there are two problems:
1. Interrupt delivery, either because of an interrupt remapping failure
or an outright interrupt allocation failure in qemu-kvm.
2. Packets not getting DMAed to the guest, possibly because of some
DMAR issue.
Before trying VT-d I made sure the connections are fine and that the
adapter works on bare-metal Linux.
I also tried VT-d assignment of the 82599 with Linux guests (32-bit PAE
and non-PAE, and amd64) and it just works.
Is this a known issue with non-Linux guest OSes and VT-d? I would
appreciate any help in debugging this problem.
-Shashidhar
Linux kernel - 2.6.35
kvm - 0.14.1
CPU - Nehalem
IO chipset - 5520
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: FreeBSD guest with VTD NIC not passing traffic
2011-12-19 14:19 FreeBSD guest with VTD NIC not passing traffic Shashidhar Patil
@ 2012-01-04 3:21 ` Alex Williamson
[not found] ` <CADve3d6aTAEK8FqjxVuRnLGu+Efy1rmsb0n5H5rq1G0Eu1s6PA@mail.gmail.com>
2012-01-13 21:00 ` Jan Kiszka
0 siblings, 2 replies; 11+ messages in thread
From: Alex Williamson @ 2012-01-04 3:21 UTC (permalink / raw)
To: Shashidhar Patil; +Cc: kvm
On Mon, 2011-12-19 at 19:49 +0530, Shashidhar Patil wrote:
> Hi,
> I am running Ubuntu 10.10 (amd64) on a 2 socket nehalem based
> server with IOH 5520. 5520 supports VTD.
> I enabled DMAR with intel_iommu=on. The box has intel 82599 adapter
> which I assigned through VT-D to FreeBSD 8.2 running
> as guest os. The ixgbe driver detects the device and the driver
> successfully configures the device. But the link
> never comes up. It looks like link up/down interrupts are not
> delivered. Then I checked kvm interrupt assignment and as expected
> kvm could not make MSI-X entries for the VT-d guest. So no output from
> "grep kvm /proc/interrupt". By enabling some debugs in the
> qemu-kvm I figured out that the MSI-x updates are not received
> properly. It does look like Linux updates MSI-X table in a batch
> fashion
> which qemu-kvm gets in one shot and every thing works fine in case of
> linux. In case of FreeBSD PCIE updates come /MSI-X entry
> which qemu-kvm can't make use.
That's right, Linux and Windows both seem to set up the MSI-X table and
then enable it in one shot, so we only trigger the interrupt programming
when the enable bit is set. We don't trigger changes on writes to the
MSI-X table, which is not a very accurate emulation of the mask bits.
> To overcome this I compiled latest intel ixgbe driver with MSI-X
> diabled. This time MSI interrupt got allocated both in the guest
> and in the qemu-kvm (host kernel). Still I could not get link UP. I
> modified the FreeBSD driver to poll for Link in some local_timer
> task. Link comes up but no traffic flows. The MAC statistics show
> packets received but packets do not reach the guest.
> DMAing of packet may be failing. I could not find out the reason.
> The same happens with legacy interrupt allocated in the guest. I even
> tried qemu-kvm prefer_msi=off.
>
> I think there are two problems1
> 1. Interrupt delivery either because of interrupt remapping failure or
> all out interrupt allocation failure in qemu-kvm
> 2. Packets not getting DMAed to the guest possibly because of some DMAR issue.
>
> Before doing VT-d I made sure connections are fine and the adapter
> works fine in bare metal Linux.
> I also tried VT-d of 82599 with Linux as guest (both 32 bits(PAE and
> non-PAE) and amd64) and it just works magically.
Unknowns: does the Intel ixgbe driver work in FreeBSD with MSI-X
disabled, and does polling for link up work? The way you describe MSI-X
interrupt setup for FreeBSD is likely something we don't handle
correctly. You might try a hack in device-assignment.c that calls
assigned_dev_update_msix() from msix_mmio_writel() any time the guest
writes to the vector control dword. We really have no support for either
the function mask or the per-vector mask right now. I doubt you're
having DMAR issues, or you'd likely be seeing errors in the host dmesg.
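Detecting a write to the vector control dword could look roughly like the sketch below. It is not code from device-assignment.c: the helper names are hypothetical, and it assumes only the standard MSI-X table layout (16-byte entries, Vector Control at byte offset 12, mask in bit 0) and 4-byte aligned guest writes.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSIX_ENTRY_SIZE     16u   /* bytes per MSI-X table entry */
#define MSIX_ENTRY_VEC_CTRL 12u   /* offset of the Vector Control dword */
#define MSIX_VEC_CTRL_MASK  0x1u  /* bit 0: per-vector mask */

/* True if a dword write at table offset `off` lands on a Vector Control dword. */
bool msix_write_hits_vec_ctrl(uint32_t off)
{
    return (off % MSIX_ENTRY_SIZE) == MSIX_ENTRY_VEC_CTRL;
}

/* Index of the table entry that contains offset `off`. */
unsigned msix_vector_of_offset(uint32_t off)
{
    return off / MSIX_ENTRY_SIZE;
}

/* True if the write changed the per-vector mask bit. */
bool msix_mask_bit_toggled(uint32_t old_ctrl, uint32_t new_ctrl)
{
    return ((old_ctrl ^ new_ctrl) & MSIX_VEC_CTRL_MASK) != 0;
}

/*
 * Hypothetical decision helper: this is the point where a hack might
 * call assigned_dev_update_msix() from the MMIO write handler.
 */
bool msix_should_reprogram(uint32_t off, uint32_t old_val, uint32_t new_val)
{
    return msix_write_hits_vec_ctrl(off) &&
           msix_mask_bit_toggled(old_val, new_val);
}
```

Restricting the trigger to mask-bit changes, rather than every table write, would keep address and data writes from causing a reprogram for each dword; whether that is sufficient for FreeBSD is untested here.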
> Is this a known issue I am hitting with non-Linux guest oses and VT-d
> ? Appreciate any help in debugging this problem.
>
> -Shashidhar
>
> Linux kernel - 2.6.35
> kvm - 0.14.1
It'd be helpful to test on something newer too. Thanks,
Alex
* Re: FreeBSD guest with VTD NIC not passing traffic
[not found] ` <CADve3d6aTAEK8FqjxVuRnLGu+Efy1rmsb0n5H5rq1G0Eu1s6PA@mail.gmail.com>
@ 2012-01-13 20:26 ` Alex Williamson
0 siblings, 0 replies; 11+ messages in thread
From: Alex Williamson @ 2012-01-13 20:26 UTC (permalink / raw)
To: Shashidhar Patil; +Cc: kvm
On Fri, 2012-01-13 at 12:25 +0530, Shashidhar Patil wrote:
> Hi Alex,
> Thanks for your help. Some answers and further investigation details
> inline.
>
> On Wed, Jan 4, 2012 at 8:51 AM, Alex Williamson
> <alex.williamson@redhat.com>wrote:
>
> > On Mon, 2011-12-19 at 19:49 +0530, Shashidhar Patil wrote:
> > > Hi,
> > > I am running Ubuntu 10.10 (amd64) on a 2 socket nehalem based
> > > server with IOH 5520. 5520 supports VTD.
> > > I enabled DMAR with intel_iommu=on. The box has intel 82599 adapter
> > > which I assigned through VT-D to FreeBSD 8.2 running
> > > as guest os. The ixgbe driver detects the device and the driver
> > > successfully configures the device. But the link
> > > never comes up. It looks like link up/down interrupts are not
> > > delivered. Then I checked kvm interrupt assignment and as expected
> > > kvm could not make MSI-X entries for the VT-d guest. So no output from
> > > "grep kvm /proc/interrupt". By enabling some debugs in the
> > > qemu-kvm I figured out that the MSI-x updates are not received
> > > properly. It does look like Linux updates MSI-X table in a batch
> > > fashion
> > > which qemu-kvm gets in one shot and every thing works fine in case of
> > > linux. In case of FreeBSD PCIE updates come /MSI-X entry
> > > which qemu-kvm can't make use.
> >
> > That's right, Linux and Windows both seem to setup the MSI-X table then
> > enable it in one shot, so we only trigger the interrupt programming when
> > the enable bit is set. We don't trigger changes on writes to the MSI-X
> > table... not very accurate emulation of mask bits.
> >
> > > To overcome this I compiled latest intel ixgbe driver with MSI-X
> > > diabled. This time MSI interrupt got allocated both in the guest
> > > and in the qemu-kvm (host kernel). Still I could not get link UP. I
> > > modified the FreeBSD driver to poll for Link in some local_timer
> > > task. Link comes up but no traffic flows. The MAC statistics show
> > > packets received but packets do not reach the guest.
> > > DMAing of packet may be failing. I could not find out the reason.
> > > The same happens with legacy interrupt allocated in the guest. I even
> > > tried qemu-kvm prefer_msi=off.
> > >
> > > I think there are two problems1
> > > 1. Interrupt delivery either because of interrupt remapping failure or
> > > all out interrupt allocation failure in qemu-kvm
> > > 2. Packets not getting DMAed to the guest possibly because of some DMAR
> > issue.
> > >
> > > Before doing VT-d I made sure connections are fine and the adapter
> > > works fine in bare metal Linux.
> > > I also tried VT-d of 82599 with Linux as guest (both 32 bits(PAE and
> > > non-PAE) and amd64) and it just works magically.
> >
> > Unknowns: does the intel ixgbe driver work in FreeBSD with MSI-X
> >
>
> I compiled Linux 3.2 with VT-d and used qemu-kvm 0.14.1.
>
> I tried the below cases with FreeBSD stock driver and Intel official
> freebsd ixgbe-2.4.4 driver.
> I tried using only msi interrupt (hw.pci.enable_msix="0" in l) it still did
> not work.
> I tried using legacy pin interrupt (hw.pci.enable_msi="0" in l) but it did
> not work.
>
> The link was not coming up because no interrupts were received.
> I modified the ixgbe_local_timer() function in the Intel driver to call
> ixgbe_check_link() so that the link status is updated periodically. Now
> the link comes up, but there is still no traffic flow.
>
> My simple connectivity:
>
> Linux (ixgbe port) --------- (FreeBSD guest VTD ixgbe port)
>
> I added a static ARP entry on Linux and ran ping. The packets are seen
> by the ixgbe port of FreeBSD: the ixgbe queue rx counts are zero, but
> mac_stats.good_pkt_rcvd shows the correct number of packets. This means
> the packets are reaching the MAC but are not being sent to the h/w
> queue.
>
> There are three separate problems:
> 1. MSI-X interrupt handling for FreeBSD is broken because of the way
> FreeBSD allocates and updates MSI-X tables.
> 2. Interrupt generation and/or delivery to the guest is not happening.
> 3. Packets received by the MAC are not DMAed, and packets sent from
> FreeBSD are not DMAed into the port's memory.
>
> The symptoms of problems 2 and 3 point to a problem in the FreeBSD driver.
> Problem 1 clearly seems to be a limitation on the qemu-kvm side, which
> expects to receive the MSI-X table in one shot.
Yep, this is somewhat caused by the underlying MSIX interface that we
work with. When we enable MSIX for the device we have to specify the
number of vectors. We can either do that based on the table size
specified in the MSIX capability or on the apparent number of vectors
the driver tries to program. We currently do the latter both because it
conserves unused MSI-X vectors and is usually a closer match to what the
guest driver is trying to do. If we had better masking support we might
be able to handle this better. Is this behavior unique to ixgbe freebsd
driver or to anything with MSI-X on freebsd? Thanks,
Alex
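The two ways of choosing a vector count can be sketched as follows. This is an illustration only: the function and struct names are hypothetical, and the rule that an entry counts as "programmed" when its message data dword is non-zero is an assumption, not a statement about what qemu-kvm does. The table-size encoding (low 11 bits of Message Control hold N-1) is from the PCI MSI-X capability layout.

```c
#include <stdint.h>

#define MSIX_FLAGS_QSIZE 0x07FFu  /* Message Control bits 10:0, table size minus one */

/* One MSI-X table entry as the guest sees it (four dwords). */
struct msix_table_entry {
    uint32_t addr_lo;
    uint32_t addr_hi;
    uint32_t data;
    uint32_t vec_ctrl;
};

/* Option 1: size the allocation from the capability's table size. */
unsigned msix_vectors_from_capability(uint16_t msg_ctrl)
{
    return (unsigned)(msg_ctrl & MSIX_FLAGS_QSIZE) + 1u;
}

/*
 * Option 2: count the entries the driver appears to have programmed.
 * Assumption: an entry with a non-zero data dword is in use.
 */
unsigned msix_vectors_from_table(const struct msix_table_entry *tbl,
                                 unsigned table_size)
{
    unsigned used = 0;

    for (unsigned i = 0; i < table_size; i++)
        if (tbl[i].data != 0)
            used++;
    return used;
}
```

With the second option, a guest that enables MSI-X before writing any entries yields a count of zero at enable time, which matches the symptom described in this thread.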
* Re: FreeBSD guest with VTD NIC not passing traffic
2012-01-04 3:21 ` Alex Williamson
[not found] ` <CADve3d6aTAEK8FqjxVuRnLGu+Efy1rmsb0n5H5rq1G0Eu1s6PA@mail.gmail.com>
@ 2012-01-13 21:00 ` Jan Kiszka
2012-01-13 21:05 ` Alex Williamson
1 sibling, 1 reply; 11+ messages in thread
From: Jan Kiszka @ 2012-01-13 21:00 UTC (permalink / raw)
To: Alex Williamson; +Cc: Shashidhar Patil, kvm
On 2012-01-04 04:21, Alex Williamson wrote:
> On Mon, 2011-12-19 at 19:49 +0530, Shashidhar Patil wrote:
>> Hi,
>> I am running Ubuntu 10.10 (amd64) on a 2 socket nehalem based
>> server with IOH 5520. 5520 supports VTD.
>> I enabled DMAR with intel_iommu=on. The box has intel 82599 adapter
>> which I assigned through VT-D to FreeBSD 8.2 running
>> as guest os. The ixgbe driver detects the device and the driver
>> successfully configures the device. But the link
>> never comes up. It looks like link up/down interrupts are not
>> delivered. Then I checked kvm interrupt assignment and as expected
>> kvm could not make MSI-X entries for the VT-d guest. So no output from
>> "grep kvm /proc/interrupt". By enabling some debugs in the
>> qemu-kvm I figured out that the MSI-x updates are not received
>> properly. It does look like Linux updates MSI-X table in a batch
>> fashion
>> which qemu-kvm gets in one shot and every thing works fine in case of
>> linux. In case of FreeBSD PCIE updates come /MSI-X entry
>> which qemu-kvm can't make use.
>
> That's right, Linux and Windows both seem to setup the MSI-X table then
> enable it in one shot, so we only trigger the interrupt programming when
> the enable bit is set. We don't trigger changes on writes to the MSI-X
> table... not very accurate emulation of mask bits.
According to the PCI spec, updates that happen while a vector is
unmasked, need not be considered by the hardware (thus the hypervisor
here). Is that the scenario here?
Jan
* Re: FreeBSD guest with VTD NIC not passing traffic
2012-01-13 21:00 ` Jan Kiszka
@ 2012-01-13 21:05 ` Alex Williamson
2012-01-13 21:33 ` Jan Kiszka
0 siblings, 1 reply; 11+ messages in thread
From: Alex Williamson @ 2012-01-13 21:05 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Shashidhar Patil, kvm
On Fri, 2012-01-13 at 22:00 +0100, Jan Kiszka wrote:
> On 2012-01-04 04:21, Alex Williamson wrote:
> > On Mon, 2011-12-19 at 19:49 +0530, Shashidhar Patil wrote:
> >> Hi,
> >> I am running Ubuntu 10.10 (amd64) on a 2 socket nehalem based
> >> server with IOH 5520. 5520 supports VTD.
> >> I enabled DMAR with intel_iommu=on. The box has intel 82599 adapter
> >> which I assigned through VT-D to FreeBSD 8.2 running
> >> as guest os. The ixgbe driver detects the device and the driver
> >> successfully configures the device. But the link
> >> never comes up. It looks like link up/down interrupts are not
> >> delivered. Then I checked kvm interrupt assignment and as expected
> >> kvm could not make MSI-X entries for the VT-d guest. So no output from
> >> "grep kvm /proc/interrupt". By enabling some debugs in the
> >> qemu-kvm I figured out that the MSI-x updates are not received
> >> properly. It does look like Linux updates MSI-X table in a batch
> >> fashion
> >> which qemu-kvm gets in one shot and every thing works fine in case of
> >> linux. In case of FreeBSD PCIE updates come /MSI-X entry
> >> which qemu-kvm can't make use.
> >
> > That's right, Linux and Windows both seem to setup the MSI-X table then
> > enable it in one shot, so we only trigger the interrupt programming when
> > the enable bit is set. We don't trigger changes on writes to the MSI-X
> > table... not very accurate emulation of mask bits.
>
> According to the PCI spec, updates that happen while a vector is
> unmasked, need not be considered by the hardware (thus the hypervisor
> here). Is that the scenario here?
I'm assuming the vector is masked in the MSI-X table. So Linux/Windows
do:
a) program MSI-X table
b) enable MSI-X in capability register
Whereas FreeBSD does:
a) enable MSI-X in capability register (vectors masked in table)
b) program and unmask individual vectors
Thanks,
Alex
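The two orderings above can be sketched as a toy model. This is not qemu-kvm code: the structs, the function names, and the rule that an entry is usable when its data dword is non-zero are assumptions made for illustration. The model only shows why an update that runs solely on the Message Control write misses entries programmed afterwards.

```c
#include <stdbool.h>
#include <stdint.h>

#define NVEC        4        /* hypothetical table size for the model */
#define MSIX_ENABLE 0x8000u  /* MSI-X Enable, Message Control bit 15 */

struct msix_entry_model {
    uint64_t addr;
    uint32_t data;
    uint32_t ctrl;           /* 0 means unmasked */
};

struct assigned_dev_model {
    uint16_t msg_ctrl;                    /* guest-visible Message Control */
    struct msix_entry_model table[NVEC];  /* guest-visible MSI-X table */
    int host_vectors;                     /* vectors set up on the host side */
    bool update_on_table_write;           /* the proposed hack, off by default */
};

/* Stand-in for the update routine: rescan the table if MSI-X is enabled. */
void model_update_msix(struct assigned_dev_model *d)
{
    d->host_vectors = 0;
    if (!(d->msg_ctrl & MSIX_ENABLE))
        return;
    for (int i = 0; i < NVEC; i++)
        if (d->table[i].data != 0)
            d->host_vectors++;
}

/* Config write to Message Control: the only trigger in the described behaviour. */
void model_config_write(struct assigned_dev_model *d, uint16_t val)
{
    d->msg_ctrl = val;
    model_update_msix(d);
}

/* MMIO write of one table entry: ignored unless the hack is switched on. */
void model_table_write(struct assigned_dev_model *d, int vec,
                       uint32_t data, uint32_t ctrl)
{
    d->table[vec].addr = 0xfee00000u;
    d->table[vec].data = data;
    d->table[vec].ctrl = ctrl;
    if (d->update_on_table_write)
        model_update_msix(d);
}

/* a) program table, b) enable MSI-X */
int linux_style(struct assigned_dev_model *d, int n)
{
    for (int i = 0; i < n && i < NVEC; i++)
        model_table_write(d, i, 0x4041u + (uint32_t)i, 0);
    model_config_write(d, MSIX_ENABLE);
    return d->host_vectors;
}

/* a) enable MSI-X, b) program and unmask individual vectors */
int freebsd_style(struct assigned_dev_model *d, int n)
{
    model_config_write(d, MSIX_ENABLE);
    for (int i = 0; i < n && i < NVEC; i++)
        model_table_write(d, i, 0x4041u + (uint32_t)i, 0);
    return d->host_vectors;
}
```

In this model the Linux-style sequence ends with all programmed vectors set up, the FreeBSD-style sequence ends with none, and setting update_on_table_write makes the FreeBSD-style sequence converge to the same result.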
* Re: FreeBSD guest with VTD NIC not passing traffic
2012-01-13 21:05 ` Alex Williamson
@ 2012-01-13 21:33 ` Jan Kiszka
2012-01-13 21:56 ` Alex Williamson
0 siblings, 1 reply; 11+ messages in thread
From: Jan Kiszka @ 2012-01-13 21:33 UTC (permalink / raw)
To: Alex Williamson; +Cc: Shashidhar Patil, kvm
On 2012-01-13 22:05, Alex Williamson wrote:
> On Fri, 2012-01-13 at 22:00 +0100, Jan Kiszka wrote:
>> On 2012-01-04 04:21, Alex Williamson wrote:
>>> On Mon, 2011-12-19 at 19:49 +0530, Shashidhar Patil wrote:
>>>> Hi,
>>>> I am running Ubuntu 10.10 (amd64) on a 2 socket nehalem based
>>>> server with IOH 5520. 5520 supports VTD.
>>>> I enabled DMAR with intel_iommu=on. The box has intel 82599 adapter
>>>> which I assigned through VT-D to FreeBSD 8.2 running
>>>> as guest os. The ixgbe driver detects the device and the driver
>>>> successfully configures the device. But the link
>>>> never comes up. It looks like link up/down interrupts are not
>>>> delivered. Then I checked kvm interrupt assignment and as expected
>>>> kvm could not make MSI-X entries for the VT-d guest. So no output from
>>>> "grep kvm /proc/interrupt". By enabling some debugs in the
>>>> qemu-kvm I figured out that the MSI-x updates are not received
>>>> properly. It does look like Linux updates MSI-X table in a batch
>>>> fashion
>>>> which qemu-kvm gets in one shot and every thing works fine in case of
>>>> linux. In case of FreeBSD PCIE updates come /MSI-X entry
>>>> which qemu-kvm can't make use.
>>>
>>> That's right, Linux and Windows both seem to setup the MSI-X table then
>>> enable it in one shot, so we only trigger the interrupt programming when
>>> the enable bit is set. We don't trigger changes on writes to the MSI-X
>>> table... not very accurate emulation of mask bits.
>>
>> According to the PCI spec, updates that happen while a vector is
>> unmasked, need not be considered by the hardware (thus the hypervisor
>> here). Is that the scenario here?
>
> I'm assuming the vector is masked in the MSI-X table. So Linux/Windows
> do:
>
> a) program MSI-X table
> b) enable MSI-X in capability register
>
> Whereas FreeBSD does:
>
> a) enable MSI-X in capability register (vectors masked in table)
> b) program and unmask individual vectors
That should work with the current code. It checks the number of vectors
on each config write, iterates the whole table, and then updates the
kernel configuration accordingly. It even requires the enable bit in the
cap register to be set before doing this.
Jan
* Re: FreeBSD guest with VTD NIC not passing traffic
2012-01-13 21:33 ` Jan Kiszka
@ 2012-01-13 21:56 ` Alex Williamson
2012-01-13 22:15 ` Jan Kiszka
0 siblings, 1 reply; 11+ messages in thread
From: Alex Williamson @ 2012-01-13 21:56 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Shashidhar Patil, kvm
On Fri, 2012-01-13 at 22:33 +0100, Jan Kiszka wrote:
> On 2012-01-13 22:05, Alex Williamson wrote:
> > On Fri, 2012-01-13 at 22:00 +0100, Jan Kiszka wrote:
> >> On 2012-01-04 04:21, Alex Williamson wrote:
> >>> On Mon, 2011-12-19 at 19:49 +0530, Shashidhar Patil wrote:
> >>>> Hi,
> >>>> I am running Ubuntu 10.10 (amd64) on a 2 socket nehalem based
> >>>> server with IOH 5520. 5520 supports VTD.
> >>>> I enabled DMAR with intel_iommu=on. The box has intel 82599 adapter
> >>>> which I assigned through VT-D to FreeBSD 8.2 running
> >>>> as guest os. The ixgbe driver detects the device and the driver
> >>>> successfully configures the device. But the link
> >>>> never comes up. It looks like link up/down interrupts are not
> >>>> delivered. Then I checked kvm interrupt assignment and as expected
> >>>> kvm could not make MSI-X entries for the VT-d guest. So no output from
> >>>> "grep kvm /proc/interrupt". By enabling some debugs in the
> >>>> qemu-kvm I figured out that the MSI-x updates are not received
> >>>> properly. It does look like Linux updates MSI-X table in a batch
> >>>> fashion
> >>>> which qemu-kvm gets in one shot and every thing works fine in case of
> >>>> linux. In case of FreeBSD PCIE updates come /MSI-X entry
> >>>> which qemu-kvm can't make use.
> >>>
> >>> That's right, Linux and Windows both seem to setup the MSI-X table then
> >>> enable it in one shot, so we only trigger the interrupt programming when
> >>> the enable bit is set. We don't trigger changes on writes to the MSI-X
> >>> table... not very accurate emulation of mask bits.
> >>
> >> According to the PCI spec, updates that happen while a vector is
> >> unmasked, need not be considered by the hardware (thus the hypervisor
> >> here). Is that the scenario here?
> >
> > I'm assuming the vector is masked in the MSI-X table. So Linux/Windows
> > do:
> >
> > a) program MSI-X table
> > b) enable MSI-X in capability register
> >
> > Whereas FreeBSD does:
> >
> > a) enable MSI-X in capability register (vectors masked in table)
> > b) program and unmask individual vectors
>
> That should work with the current code. It checks the number of vectors
> on each config write, iterates the whole table, and then updates the
^^^^^^^^^^^^^^^^^^^^
> kernel configuration accordingly. It even requires the enable bit in the
> cap register to be set before doing this.
That's the problem, we only do it on config writes overlapping the MSI-X
flags. We don't do anything for writes to the MSI-X table. It might be
as simple as calling assigned_dev_update_msix() from msix_mmio_writel()
when the mask bit is toggled. I'm not sure what might fall out of that
though. Thanks,
Alex
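The "config writes overlapping the MSI-X flags" condition can be sketched as a simple range check. The helper names are hypothetical, and the capability offset is passed in by the caller; the constants follow the Linux names PCI_MSIX_FLAGS, PCI_MSIX_FLAGS_ENABLE and PCI_MSIX_FLAGS_MASKALL.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_MSIX_FLAGS         2u       /* Message Control offset in the capability */
#define PCI_MSIX_FLAGS_ENABLE  0x8000u  /* bit 15 */
#define PCI_MSIX_FLAGS_MASKALL 0x4000u  /* bit 14 */

/* True if the config write [addr, addr + len) touches the 2-byte flags word. */
bool cfg_write_overlaps_msix_flags(uint32_t addr, uint32_t len,
                                   uint32_t msix_cap_pos)
{
    uint32_t flags_start = msix_cap_pos + PCI_MSIX_FLAGS;
    uint32_t flags_end = flags_start + 2u;

    return addr < flags_end && addr + len > flags_start;
}

/* True if the flags value has MSI-X enabled. */
bool msix_enabled(uint16_t flags)
{
    return (flags & PCI_MSIX_FLAGS_ENABLE) != 0;
}

/* True if the function-wide mask is set. */
bool msix_function_masked(uint16_t flags)
{
    return (flags & PCI_MSIX_FLAGS_MASKALL) != 0;
}
```

As a usage example, if the MSI-X capability sat at 0x70 (an assumption, not stated in the thread), a 2-byte write of 0x003f to address 0x72 would overlap the flags word but leave the enable bit clear, so an update gated on the enable bit would do nothing for that write.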
* Re: FreeBSD guest with VTD NIC not passing traffic
2012-01-13 21:56 ` Alex Williamson
@ 2012-01-13 22:15 ` Jan Kiszka
2012-01-14 6:46 ` Shashidhar Patil
0 siblings, 1 reply; 11+ messages in thread
From: Jan Kiszka @ 2012-01-13 22:15 UTC (permalink / raw)
To: Alex Williamson; +Cc: Shashidhar Patil, kvm
On 2012-01-13 22:56, Alex Williamson wrote:
> On Fri, 2012-01-13 at 22:33 +0100, Jan Kiszka wrote:
>> On 2012-01-13 22:05, Alex Williamson wrote:
>>> On Fri, 2012-01-13 at 22:00 +0100, Jan Kiszka wrote:
>>>> On 2012-01-04 04:21, Alex Williamson wrote:
>>>>> On Mon, 2011-12-19 at 19:49 +0530, Shashidhar Patil wrote:
>>>>>> Hi,
>>>>>> I am running Ubuntu 10.10 (amd64) on a 2 socket nehalem based
>>>>>> server with IOH 5520. 5520 supports VTD.
>>>>>> I enabled DMAR with intel_iommu=on. The box has intel 82599 adapter
>>>>>> which I assigned through VT-D to FreeBSD 8.2 running
>>>>>> as guest os. The ixgbe driver detects the device and the driver
>>>>>> successfully configures the device. But the link
>>>>>> never comes up. It looks like link up/down interrupts are not
>>>>>> delivered. Then I checked kvm interrupt assignment and as expected
>>>>>> kvm could not make MSI-X entries for the VT-d guest. So no output from
>>>>>> "grep kvm /proc/interrupt". By enabling some debugs in the
>>>>>> qemu-kvm I figured out that the MSI-x updates are not received
>>>>>> properly. It does look like Linux updates MSI-X table in a batch
>>>>>> fashion
>>>>>> which qemu-kvm gets in one shot and every thing works fine in case of
>>>>>> linux. In case of FreeBSD PCIE updates come /MSI-X entry
>>>>>> which qemu-kvm can't make use.
>>>>>
>>>>> That's right, Linux and Windows both seem to setup the MSI-X table then
>>>>> enable it in one shot, so we only trigger the interrupt programming when
>>>>> the enable bit is set. We don't trigger changes on writes to the MSI-X
>>>>> table... not very accurate emulation of mask bits.
>>>>
>>>> According to the PCI spec, updates that happen while a vector is
>>>> unmasked, need not be considered by the hardware (thus the hypervisor
>>>> here). Is that the scenario here?
>>>
>>> I'm assuming the vector is masked in the MSI-X table. So Linux/Windows
>>> do:
>>>
>>> a) program MSI-X table
>>> b) enable MSI-X in capability register
>>>
>>> Whereas FreeBSD does:
>>>
>>> a) enable MSI-X in capability register (vectors masked in table)
>>> b) program and unmask individual vectors
>>
>> That should work with the current code. It checks the number of vectors
>> on each config write, iterates the whole table, and then updates the
> ^^^^^^^^^^^^^^^^^^^^
>> kernel configuration accordingly. It even requires the enable bit in the
>> cap register to be set before doing this.
>
> That's the problem, we only do it on config writes overlapping the MSI-X
> flags. We don't do anything for writes to the MSI-X table. It might be
> as simple as calling assigned_dev_update_msix() from msix_mmio_writel()
> when the mask bit is toggled. I'm not sure what might fall out of that
> though.
Ah indeed. Now I recall having fixed this in my MSI-X refactoring
series. I introduced config notifiers that are triggered by the MSI-X
layer on every relevant modification, and the device assignment code
hooks the update function into this. I really need to dig into that
series again soon and refresh it.
In the meantime, we could try what you suggest (if the cap enable bit is
set).
Jan
* Re: FreeBSD guest with VTD NIC not passing traffic
2012-01-13 22:15 ` Jan Kiszka
@ 2012-01-14 6:46 ` Shashidhar Patil
2012-01-14 9:23 ` Shashidhar Patil
0 siblings, 1 reply; 11+ messages in thread
From: Shashidhar Patil @ 2012-01-14 6:46 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Alex Williamson, kvm
[-- Attachment #1: Type: text/plain, Size: 4836 bytes --]
Hi Alex, Jan,
I collected logs of the PCI update processing in KVM (attached to this
mail). (I will try your suggestion soon.)
The Linux kernel source below shows the MSI-X allocation being done with
the MSI-X enable flag cleared, which works fine with KVM.
static int msix_capability_init(struct pci_dev *dev,
				struct msix_entry *entries, int nvec)
{
	int pos, ret;
	u16 control;
	void __iomem *base;

	pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
	pci_read_config_word(dev, pos + PCI_MSIX_FLAGS, &control);

	/* Ensure MSI-X is disabled while it is set up */
	control &= ~PCI_MSIX_FLAGS_ENABLE;
	pci_write_config_word(dev, pos + PCI_MSIX_FLAGS, control);

	/* Request & Map MSI-X table region */
	base = msix_map_region(dev, pos, multi_msix_capable(control));
	if (!base)
		return -ENOMEM;

	ret = msix_setup_entries(dev, pos, base, entries, nvec);
	if (ret)
		return ret;

	ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
	if (ret)
		goto error;

	/*
	 * Some devices require MSI-X to be enabled before we can touch the
	 * MSI-X registers. We need to mask all the vectors to prevent
	 * interrupts coming in before they're fully set up.
	 */
	control |= PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE;
	pci_write_config_word(dev, pos + PCI_MSIX_FLAGS, control);

On Sat, Jan 14, 2012 at 3:45 AM, Jan Kiszka <jan.kiszka@web.de> wrote:
> On 2012-01-13 22:56, Alex Williamson wrote:
>> On Fri, 2012-01-13 at 22:33 +0100, Jan Kiszka wrote:
>>> On 2012-01-13 22:05, Alex Williamson wrote:
>>>> On Fri, 2012-01-13 at 22:00 +0100, Jan Kiszka wrote:
>>>>> On 2012-01-04 04:21, Alex Williamson wrote:
>>>>>> On Mon, 2011-12-19 at 19:49 +0530, Shashidhar Patil wrote:
>>>>>>> Hi,
>>>>>>> I am running Ubuntu 10.10 (amd64) on a 2 socket nehalem based
>>>>>>> server with IOH 5520. 5520 supports VTD.
>>>>>>> I enabled DMAR with intel_iommu=on. The box has intel 82599 adapter
>>>>>>> which I assigned through VT-D to FreeBSD 8.2 running
>>>>>>> as guest os. The ixgbe driver detects the device and the driver
>>>>>>> successfully configures the device. But the link
>>>>>>> never comes up. It looks like link up/down interrupts are not
>>>>>>> delivered. Then I checked kvm interrupt assignment and as expected
>>>>>>> kvm could not make MSI-X entries for the VT-d guest. So no output from
>>>>>>> "grep kvm /proc/interrupt". By enabling some debugs in the
>>>>>>> qemu-kvm I figured out that the MSI-x updates are not received
>>>>>>> properly. It does look like Linux updates MSI-X table in a batch
>>>>>>> fashion
>>>>>>> which qemu-kvm gets in one shot and every thing works fine in case of
>>>>>>> linux. In case of FreeBSD PCIE updates come /MSI-X entry
>>>>>>> which qemu-kvm can't make use.
>>>>>>
>>>>>> That's right, Linux and Windows both seem to setup the MSI-X table then
>>>>>> enable it in one shot, so we only trigger the interrupt programming when
>>>>>> the enable bit is set. We don't trigger changes on writes to the MSI-X
>>>>>> table... not very accurate emulation of mask bits.
>>>>>
>>>>> According to the PCI spec, updates that happen while a vector is
>>>>> unmasked, need not be considered by the hardware (thus the hypervisor
>>>>> here). Is that the scenario here?
>>>>
>>>> I'm assuming the vector is masked in the MSI-X table. So Linux/Windows
>>>> do:
>>>>
>>>> a) program MSI-X table
>>>> b) enable MSI-X in capability register
>>>>
>>>> Whereas FreeBSD does:
>>>>
>>>> a) enable MSI-X in capability register (vectors masked in table)
>>>> b) program and unmask individual vectors
>>>
>>> That should work with the current code. It checks the number of vectors
>>> on each config write, iterates the whole table, and then updates the
>> ^^^^^^^^^^^^^^^^^^^^
>>> kernel configuration accordingly. It even requires the enable bit in the
>>> cap register to be set before doing this.
>>
>> That's the problem, we only do it on config writes overlapping the MSI-X
>> flags. We don't do anything for writes to the MSI-X table. It might be
>> as simple as calling assigned_dev_update_msix() from msix_mmio_writel()
>> when the mask bit is toggled. I'm not sure what might fall out of that
>> though.
>
> Ah indeed. Now I recall to have fixed this in my MSI-X refactoring
> series. I introduced config notifiers that are triggered by the MSI-X
> layer on every relevant modification, and the device assignment code
> hook the update function into this. I really need to dig into that
> series soon again and refresh it.
>
> In the meantime, we could try what you suggest (if the cap enable bit is
> set).
>
> Jan
>
[-- Attachment #2: kvm_bsd_msix.log --]
[-- Type: application/octet-stream, Size: 11608 bytes --]
assigned_dev_pci_read_config: (4.0): address=0044 val=0x00000008 len=2
assigned_dev_pci_write_config: (4.0): address=0010 val=0xf2080000 len=4
assigned_dev_pci_write_config: (4.0): address=0014 val=0x00000000 len=4
assigned_dev_pci_write_config: (4.0): address=0018 val=0x0000c081 len=4
assigned_dev_pci_write_config: (4.0): address=001c val=0x00000000 len=4
assigned_dev_pci_write_config: (4.0): address=0020 val=0xf2100000 len=4
assigned_dev_pci_write_config: (4.0): address=0024 val=0x00000000 len=4
assigned_dev_pci_write_config: (4.0): address=0030 val=0xf2180000 len=4
assigned_dev_pci_write_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: NON BAR (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: (4.0): address=003c val=0x0000000b len=1
assigned_dev_pci_write_config: (4.0): address=003d val=0x00000001 len=1
assigned_dev_pci_write_config: (4.0): address=003e val=0x00000000 len=1
assigned_dev_pci_write_config: NON BAR (4.0): address=003e val=0x00000000 len=1
assigned_dev_pci_write_config: (4.0): address=003f val=0x00000000 len=1
assigned_dev_pci_write_config: NON BAR (4.0): address=003f val=0x00000000 len=1
assigned_dev_pci_write_config: (4.0): address=000c val=0x00000010 len=1
assigned_dev_pci_write_config: NON BAR (4.0): address=000c val=0x00000010 len=1
assigned_dev_pci_write_config: (4.0): address=000d val=0x00000000 len=1
assigned_dev_pci_write_config: NON BAR (4.0): address=000d val=0x00000000 len=1
assigned_dev_pci_write_config: (4.0): address=0009 val=0x00000000 len=1
assigned_dev_pci_write_config: NON BAR (4.0): address=0009 val=0x00000000 len=1
assigned_dev_pci_write_config: (4.0): address=0008 val=0x00000001 len=1
assigned_dev_pci_write_config: NON BAR (4.0): address=0008 val=0x00000001 len=1
assigned_dev_pci_write_config: (4.0): address=0052 val=0x00000000 len=2
Received PCI update
Assigning legacy interrupt
assigned_dev_pci_write_config: (4.0): address=0072 val=0x0000003f len=2
Assigning legacy interrupt
assigned_dev_pci_read_config: (4.0): address=0000 val=0x00008086 len=2
assigned_dev_pci_read_config: (4.0): address=0002 val=0x000010fb len=2
assigned_dev_pci_read_config: (4.0): address=002c val=0x00008086 len=2
assigned_dev_pci_read_config: (4.0): address=002e val=0x00007a11 len=2
assigned_dev_pci_read_config: (4.0): address=0000 val=0x00008086 len=2
assigned_dev_pci_read_config: (4.0): address=0002 val=0x000010fb len=2
assigned_dev_pci_read_config: (4.0): address=002c val=0x00008086 len=2
assigned_dev_pci_read_config: (4.0): address=002e val=0x00007a11 len=2
assigned_dev_pci_read_config: (4.0): address=0000 val=0x10fb8086 len=4
assigned_dev_pci_read_config: (4.0): address=0008 val=0x00000001 len=1
assigned_dev_pci_read_config: (4.0): address=002c val=0x00008086 len=2
assigned_dev_pci_read_config: (4.0): address=002e val=0x00007a11 len=2
assigned_dev_pci_read_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: NON BAR (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_read_config: (4.0): address=001c val=0x00000000 len=4
assigned_dev_pci_read_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: (4.0): address=0004 val=0x00000401 len=2
assigned_dev_pci_write_config: NON BAR (4.0): address=0004 val=0x00000401 len=2
assigned_dev_pci_write_config: (4.0): address=001c val=0xffffffff len=4
assigned_dev_pci_read_config: (4.0): address=001c val=0x00000000 len=4
assigned_dev_pci_write_config: (4.0): address=001c val=0x00000000 len=4
assigned_dev_pci_write_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_iomem_map: e_phys=f2080000 r_virt=0x7f9e5407f000 type=0 len=00080000 region_num=0
assigned_dev_iomem_map: e_phys=f2100000 r_virt=0x7f9e992bc000 type=0 len=00004000 region_num=4
assigned_dev_pci_write_config: NON BAR (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_read_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: NON BAR (4.0): address=0004 val=0x00000403 len=2
msix_mmio_writel: write to MSI-X entry table mmio offset 0xc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x1c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x2c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x3c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x4c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x5c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x6c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x7c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x8c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x9c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0xac, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0xbc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0xcc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0xdc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0xec, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0xfc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x10c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x11c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x12c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x13c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x14c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x15c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x16c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x17c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x18c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x19c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x1ac, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x1bc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x1cc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x1dc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x1ec, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x1fc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x20c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x21c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x22c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x23c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x24c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x25c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x26c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x27c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x28c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x29c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x2ac, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x2bc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x2cc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x2dc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x2ec, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x2fc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x30c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x31c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x32c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x33c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x34c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x35c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x36c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x37c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x38c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x39c, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x3ac, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x3bc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x3cc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x3dc, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x3ec, val 0x1
msix_mmio_writel: write to MSI-X entry table mmio offset 0x3fc, val 0x1
assigned_dev_pci_write_config: (4.0): address=0072 val=0x0000803f len=2
the MSIX capabilty position is 0x70
the MSIX entries_max_nr is 0x40
MSI-X entry number is zero!
assigned_dev_update_msix_mmio: Interrupted system call
msix_mmio_writel: write to MSI-X entry table mmio offset 0x0, val 0xfee00000
msix_mmio_writel: write to MSI-X entry table mmio offset 0x4, val 0x0
msix_mmio_writel: write to MSI-X entry table mmio offset 0x8, val 0x30
msix_mmio_writel: write to MSI-X entry table mmio offset 0xc, val 0x0
assigned_dev_pci_read_config: (4.0): address=0004 val=0x00000003 len=2
assigned_dev_pci_write_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: NON BAR (4.0): address=0004 val=0x00000403 len=2
msix_mmio_writel: write to MSI-X entry table mmio offset 0x10, val 0xfee01000
msix_mmio_writel: write to MSI-X entry table mmio offset 0x14, val 0x0
msix_mmio_writel: write to MSI-X entry table mmio offset 0x18, val 0x32
msix_mmio_writel: write to MSI-X entry table mmio offset 0x1c, val 0x0
assigned_dev_pci_read_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: NON BAR (4.0): address=0004 val=0x00000403 len=2
msix_mmio_writel: write to MSI-X entry table mmio offset 0x20, val 0xfee02000
msix_mmio_writel: write to MSI-X entry table mmio offset 0x24, val 0x0
msix_mmio_writel: write to MSI-X entry table mmio offset 0x28, val 0x32
msix_mmio_writel: write to MSI-X entry table mmio offset 0x2c, val 0x0
assigned_dev_pci_read_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: NON BAR (4.0): address=0004 val=0x00000403 len=2
msix_mmio_writel: write to MSI-X entry table mmio offset 0x30, val 0xfee03000
msix_mmio_writel: write to MSI-X entry table mmio offset 0x34, val 0x0
msix_mmio_writel: write to MSI-X entry table mmio offset 0x38, val 0x32
msix_mmio_writel: write to MSI-X entry table mmio offset 0x3c, val 0x0
assigned_dev_pci_read_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: NON BAR (4.0): address=0004 val=0x00000403 len=2
msix_mmio_writel: write to MSI-X entry table mmio offset 0x40, val 0xfee00000
msix_mmio_writel: write to MSI-X entry table mmio offset 0x44, val 0x0
msix_mmio_writel: write to MSI-X entry table mmio offset 0x48, val 0x31
msix_mmio_writel: write to MSI-X entry table mmio offset 0x4c, val 0x0
assigned_dev_pci_read_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_write_config: NON BAR (4.0): address=0004 val=0x00000403 len=2
assigned_dev_pci_read_config: (4.0): address=00b2 val=0x00000042 len=2
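
As a reading aid for the msix_mmio_writel lines above, here is a minimal
sketch in C. It is not taken from qemu-kvm, and the enum and function names
are hypothetical. It decodes a table offset into an entry index and field,
and decodes the Message Control word written at config address 0x72. It
assumes the standard MSI-X layout: 16-byte table entries made of address
low, address high, data and vector control, with bit 15 of Message Control
as the enable bit and bits 10:0 holding the table size minus one.

```c
#include <assert.h>
#include <stdint.h>

/* The four 32-bit fields of one 16-byte MSI-X table entry, in order. */
enum msix_field {
    MSIX_ADDR_LO = 0,
    MSIX_ADDR_HI = 1,
    MSIX_DATA = 2,
    MSIX_VECTOR_CTRL = 3
};

/* Which table entry a byte offset into the MSI-X table falls in. */
unsigned msix_entry_index(uint32_t offset)
{
    return offset / 16;
}

/* Which field of that entry the offset addresses. */
enum msix_field msix_entry_field(uint32_t offset)
{
    assert(offset % 4 == 0);  /* the log only shows aligned 32-bit writes */
    return (enum msix_field)((offset % 16) / 4);
}

/* Message Control, bit 15: MSI-X Enable. */
int msix_ctrl_enabled(uint16_t msg_ctrl)
{
    return (msg_ctrl >> 15) & 1;
}

/* Message Control, bits 10:0: number of table entries minus one. */
unsigned msix_ctrl_table_size(uint16_t msg_ctrl)
{
    return (msg_ctrl & 0x7ffu) + 1u;
}
```

With these helpers, offset 0x3fc decodes to entry 63, vector control field,
and 0x803f decodes to MSI-X enabled with a 64-entry table, which agrees with
the "entries_max_nr is 0x40" line in the log. The run of val 0x1 writes to
offsets 0xc through 0x3fc therefore reads as the guest masking all 64
vectors before it sets the enable bit.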
* Re: FreeBSD guest with VTD NIC not passing traffic
2012-01-14 6:46 ` Shashidhar Patil
@ 2012-01-14 9:23 ` Shashidhar Patil
2012-01-14 20:25 ` Shashidhar Patil
0 siblings, 1 reply; 11+ messages in thread
From: Shashidhar Patil @ 2012-01-14 9:23 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Alex Williamson, kvm
Hi Alex, Jan,
I forgot to mention that in the MSI-X failure case I do not see any
interrupts being allocated by the host (kvm module): the output of
"grep kvm /proc/interrupts" is empty.
-Shashidhar
On Sat, Jan 14, 2012 at 12:16 PM, Shashidhar Patil
<shashidhar.patil@gmail.com> wrote:
> Hi Alex,Jan,
> I collected logs of pci updates processing of kvm(attached to this mail).
> (I will try your suggestion soon)
>
> The below source of Linux kernel shows the msix allocation done with
> MSIX_ENABLE_FLAG
> masked which works fine with kvm.
>
> static int msix_capability_init(struct pci_dev *dev,
> struct msix_entry *entries, int nvec)
> {
> int pos, ret;
> u16 control;
> void __iomem *base;
>
> pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
> pci_read_config_word(dev, pos + PCI_MSIX_FLAGS, &control);
>
> /* Ensure MSI-X is disabled while it is set up */
> control &= ~PCI_MSIX_FLAGS_ENABLE;
> pci_write_config_word(dev, pos + PCI_MSIX_FLAGS, control);
>
> /* Request & Map MSI-X table region */
> base = msix_map_region(dev, pos, multi_msix_capable(control));
> if (!base)
> return -ENOMEM;
>
> ret = msix_setup_entries(dev, pos, base, entries, nvec);
> if (ret)
> return ret;
>
> ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
> if (ret)
> /*
> * Some devices require MSI-X to be enabled before we can touch the
> * MSI-X registers. We need to mask all the vectors to prevent
> * interrupts coming in before they're fully set up.
> */
> control |= PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE;
> pci_write_config_word(dev, pos + PCI_MSIX_FLAGS, control);
>
> On Sat, Jan 14, 2012 at 3:45 AM, Jan Kiszka <jan.kiszka@web.de> wrote:
>> On 2012-01-13 22:56, Alex Williamson wrote:
>>> On Fri, 2012-01-13 at 22:33 +0100, Jan Kiszka wrote:
>>>> On 2012-01-13 22:05, Alex Williamson wrote:
>>>>> On Fri, 2012-01-13 at 22:00 +0100, Jan Kiszka wrote:
>>>>>> On 2012-01-04 04:21, Alex Williamson wrote:
>>>>>>> On Mon, 2011-12-19 at 19:49 +0530, Shashidhar Patil wrote:
>>>>>>>> Hi,
>>>>>>>> I am running Ubuntu 10.10 (amd64) on a 2 socket nehalem based
>>>>>>>> server with IOH 5520. 5520 supports VTD.
>>>>>>>> I enabled DMAR with intel_iommu=on. The box has intel 82599 adapter
>>>>>>>> which I assigned through VT-D to FreeBSD 8.2 running
>>>>>>>> as guest os. The ixgbe driver detects the device and the driver
>>>>>>>> successfully configures the device. But the link
>>>>>>>> never comes up. It looks like link up/down interrupts are not
>>>>>>>> delivered. Then I checked kvm interrupt assignment and as expected
>>>>>>>> kvm could not make MSI-X entries for the VT-d guest. So no output from
>>>>>>>> "grep kvm /proc/interrupt". By enabling some debugs in the
>>>>>>>> qemu-kvm I figured out that the MSI-x updates are not received
>>>>>>>> properly. It does look like Linux updates MSI-X table in a batch
>>>>>>>> fashion
>>>>>>>> which qemu-kvm gets in one shot and every thing works fine in case of
>>>>>>>> linux. In case of FreeBSD PCIE updates come /MSI-X entry
>>>>>>>> which qemu-kvm can't make use.
>>>>>>>
>>>>>>> That's right, Linux and Windows both seem to setup the MSI-X table then
>>>>>>> enable it in one shot, so we only trigger the interrupt programming when
>>>>>>> the enable bit is set. We don't trigger changes on writes to the MSI-X
>>>>>>> table... not very accurate emulation of mask bits.
>>>>>>
>>>>>> According to the PCI spec, updates that happen while a vector is
>>>>>> unmasked, need not be considered by the hardware (thus the hypervisor
>>>>>> here). Is that the scenario here?
>>>>>
>>>>> I'm assuming the vector is masked in the MSI-X table. So Linux/Windows
>>>>> do:
>>>>>
>>>>> a) program MSI-X table
>>>>> b) enable MSI-X in capability register
>>>>>
>>>>> Whereas FreeBSD does:
>>>>>
>>>>> a) enable MSI-X in capability register (vectors masked in table)
>>>>> b) program and unmask individual vectors
>>>>
>>>> That should work with the current code. It checks the number of vectors
>>>> on each config write, iterates the whole table, and then updates the
>>> ^^^^^^^^^^^^^^^^^^^^
>>>> kernel configuration accordingly. It even requires the enable bit in the
>>>> cap register to be set before doing this.
>>>
>>> That's the problem, we only do it on config writes overlapping the MSI-X
>>> flags. We don't do anything for writes to the MSI-X table. It might be
>>> as simple as calling assigned_dev_update_msix() from msix_mmio_writel()
>>> when the mask bit is toggled. I'm not sure what might fall out of that
>>> though.
>>
>> Ah indeed. Now I recall to have fixed this in my MSI-X refactoring
>> series. I introduced config notifiers that are triggered by the MSI-X
>> layer on every relevant modification, and the device assignment code
>> hook the update function into this. I really need to dig into that
>> series soon again and refresh it.
>>
>> In the meantime, we could try what you suggest (if the cap enable bit is
>> set).
>>
>> Jan
>>
* Re: FreeBSD guest with VTD NIC not passing traffic
2012-01-14 9:23 ` Shashidhar Patil
@ 2012-01-14 20:25 ` Shashidhar Patil
0 siblings, 0 replies; 11+ messages in thread
From: Shashidhar Patil @ 2012-01-14 20:25 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Alex Williamson, kvm
Hi Alex, Jan,
I put in the hack of calling assigned_dev_update_msix() from
msix_mmio_writel() when the last MSI-X vector is updated. Now MSI-X
interrupt allocation happens as expected in the host (Linux). But this is
an ugly hack that hardcodes an address, since we don't know how many MSI-X
vectors the guest will configure.
Unlike Linux, BSD first enables MSI-X by writing to the PCIe MSI-X
control word and allocates vectors, but does not write the vector data to
the MSI-X table at that point. Later, as part of setup_intr processing,
it writes the vector data to the MSI-X table.
Since KVM acts only on writes to the MSI-X capability in PCI config space
(and not on writes to the MSI-X table), it does not find any vectors when
MSIX_ENABLE is seen, so it does not allocate MSI-X vectors in the host.
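
The ordering problem described above can be illustrated with a toy model.
This is a sketch, not qemu-kvm code: struct model, model_rescan() and the
two guest_* functions are hypothetical names, NVEC is an arbitrary table
size, and the "usable vector" rule (data non-zero and not masked) is an
assumption chosen to match the "MSI-X entry number is zero!" message in the
log. The rescan_on_table_write flag stands in for Alex's suggestion of
calling assigned_dev_update_msix() from msix_mmio_writel().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NVEC             4        /* hypothetical table size, for illustration */
#define MSIX_CTRL_ENABLE 0x8000u  /* MSI-X Enable: bit 15 of Message Control */
#define MSIX_VEC_MASKED  0x1u     /* per-vector mask: bit 0 of Vector Control */

struct msix_entry {
    uint32_t addr_lo, addr_hi, data, vector_ctrl;
};

struct model {
    struct msix_entry table[NVEC];
    uint16_t msg_ctrl;
    bool rescan_on_table_write;  /* false: config writes only; true: table writes too */
    int host_vectors;            /* vectors the model would allocate in the host */
};

void model_init(struct model *m, bool rescan_on_table_write)
{
    memset(m, 0, sizeof(*m));
    m->rescan_on_table_write = rescan_on_table_write;
    for (int i = 0; i < NVEC; i++)
        m->table[i].vector_ctrl = MSIX_VEC_MASKED;  /* vectors start masked */
}

/* Count programmed, unmasked entries; none are usable until MSI-X is enabled. */
void model_rescan(struct model *m)
{
    int usable = 0;
    if (m->msg_ctrl & MSIX_CTRL_ENABLE) {
        for (int i = 0; i < NVEC; i++) {
            const struct msix_entry *e = &m->table[i];
            if (e->data != 0 && !(e->vector_ctrl & MSIX_VEC_MASKED))
                usable++;
        }
    }
    m->host_vectors = usable;
}

/* A config-space write to Message Control always triggers a rescan. */
void config_write_msg_ctrl(struct model *m, uint16_t val)
{
    m->msg_ctrl = val;
    model_rescan(m);
}

/* Program one table entry and unmask it; rescan only if the flag is set. */
void table_write_entry(struct model *m, int idx, uint32_t addr_lo, uint32_t data)
{
    assert(idx >= 0 && idx < NVEC);
    m->table[idx].addr_lo = addr_lo;
    m->table[idx].addr_hi = 0;
    m->table[idx].data = data;
    m->table[idx].vector_ctrl = 0;
    if (m->rescan_on_table_write)
        model_rescan(m);
}

/* Sequence attributed to Linux/Windows in the thread: table first, then enable. */
int guest_table_then_enable(bool rescan_on_table_write)
{
    struct model m;
    model_init(&m, rescan_on_table_write);
    for (int i = 0; i < NVEC; i++)
        table_write_entry(&m, i, 0xfee00000u, 0x30u + (uint32_t)i);
    config_write_msg_ctrl(&m, MSIX_CTRL_ENABLE);
    return m.host_vectors;
}

/* Sequence attributed to FreeBSD in the thread: enable first, then the table. */
int guest_enable_then_table(bool rescan_on_table_write)
{
    struct model m;
    model_init(&m, rescan_on_table_write);
    config_write_msg_ctrl(&m, MSIX_CTRL_ENABLE);
    for (int i = 0; i < NVEC; i++)
        table_write_entry(&m, i, 0xfee00000u, 0x30u + (uint32_t)i);
    return m.host_vectors;
}
```

In this model guest_table_then_enable(false) ends with NVEC host vectors,
guest_enable_then_table(false) ends with none, and
guest_enable_then_table(true) ends with NVEC again, without hardcoding
which table entry is the last one.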
Even with the MSI-X interrupt fix, I neither see the link come up nor see
any packet movement.
It looks like the driver is badly broken in the virtualization case.
Tx
-Shashidhar
On Sat, Jan 14, 2012 at 2:53 PM, Shashidhar Patil
<shashidhar.patil@gmail.com> wrote:
> Hi Alex, Jan,
> I forgot to mention that in the MSI-X failure case I do not see any
> interrupts being allocated by the host (kvm module): the output of
> "grep kvm /proc/interrupts" is empty.
>
> -Shashidhar
>
> On Sat, Jan 14, 2012 at 12:16 PM, Shashidhar Patil
> <shashidhar.patil@gmail.com> wrote:
>> Hi Alex,Jan,
>> I collected logs of pci updates processing of kvm(attached to this mail).
>> (I will try your suggestion soon)
>>
>> The below source of Linux kernel shows the msix allocation done with
>> MSIX_ENABLE_FLAG
>> masked which works fine with kvm.
>>
>> static int msix_capability_init(struct pci_dev *dev,
>> struct msix_entry *entries, int nvec)
>> {
>> int pos, ret;
>> u16 control;
>> void __iomem *base;
>>
>> pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
>> pci_read_config_word(dev, pos + PCI_MSIX_FLAGS, &control);
>>
>> /* Ensure MSI-X is disabled while it is set up */
>> control &= ~PCI_MSIX_FLAGS_ENABLE;
>> pci_write_config_word(dev, pos + PCI_MSIX_FLAGS, control);
>>
>> /* Request & Map MSI-X table region */
>> base = msix_map_region(dev, pos, multi_msix_capable(control));
>> if (!base)
>> return -ENOMEM;
>>
>> ret = msix_setup_entries(dev, pos, base, entries, nvec);
>> if (ret)
>> return ret;
>>
>> ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
>> if (ret)
>> /*
>> * Some devices require MSI-X to be enabled before we can touch the
>> * MSI-X registers. We need to mask all the vectors to prevent
>> * interrupts coming in before they're fully set up.
>> */
>> control |= PCI_MSIX_FLAGS_MASKALL | PCI_MSIX_FLAGS_ENABLE;
>> pci_write_config_word(dev, pos + PCI_MSIX_FLAGS, control);
>>
>> On Sat, Jan 14, 2012 at 3:45 AM, Jan Kiszka <jan.kiszka@web.de> wrote:
>>> On 2012-01-13 22:56, Alex Williamson wrote:
>>>> On Fri, 2012-01-13 at 22:33 +0100, Jan Kiszka wrote:
>>>>> On 2012-01-13 22:05, Alex Williamson wrote:
>>>>>> On Fri, 2012-01-13 at 22:00 +0100, Jan Kiszka wrote:
>>>>>>> On 2012-01-04 04:21, Alex Williamson wrote:
>>>>>>>> On Mon, 2011-12-19 at 19:49 +0530, Shashidhar Patil wrote:
>>>>>>>>> Hi,
>>>>>>>>> I am running Ubuntu 10.10 (amd64) on a 2 socket nehalem based
>>>>>>>>> server with IOH 5520. 5520 supports VTD.
>>>>>>>>> I enabled DMAR with intel_iommu=on. The box has intel 82599 adapter
>>>>>>>>> which I assigned through VT-D to FreeBSD 8.2 running
>>>>>>>>> as guest os. The ixgbe driver detects the device and the driver
>>>>>>>>> successfully configures the device. But the link
>>>>>>>>> never comes up. It looks like link up/down interrupts are not
>>>>>>>>> delivered. Then I checked kvm interrupt assignment and as expected
>>>>>>>>> kvm could not make MSI-X entries for the VT-d guest. So no output from
>>>>>>>>> "grep kvm /proc/interrupt". By enabling some debugs in the
>>>>>>>>> qemu-kvm I figured out that the MSI-x updates are not received
>>>>>>>>> properly. It does look like Linux updates MSI-X table in a batch
>>>>>>>>> fashion
>>>>>>>>> which qemu-kvm gets in one shot and every thing works fine in case of
>>>>>>>>> linux. In case of FreeBSD PCIE updates come /MSI-X entry
>>>>>>>>> which qemu-kvm can't make use.
>>>>>>>>
>>>>>>>> That's right, Linux and Windows both seem to setup the MSI-X table then
>>>>>>>> enable it in one shot, so we only trigger the interrupt programming when
>>>>>>>> the enable bit is set. We don't trigger changes on writes to the MSI-X
>>>>>>>> table... not very accurate emulation of mask bits.
>>>>>>>
>>>>>>> According to the PCI spec, updates that happen while a vector is
>>>>>>> unmasked, need not be considered by the hardware (thus the hypervisor
>>>>>>> here). Is that the scenario here?
>>>>>>
>>>>>> I'm assuming the vector is masked in the MSI-X table. So Linux/Windows
>>>>>> do:
>>>>>>
>>>>>> a) program MSI-X table
>>>>>> b) enable MSI-X in capability register
>>>>>>
>>>>>> Whereas FreeBSD does:
>>>>>>
>>>>>> a) enable MSI-X in capability register (vectors masked in table)
>>>>>> b) program and unmask individual vectors
>>>>>
>>>>> That should work with the current code. It checks the number of vectors
>>>>> on each config write, iterates the whole table, and then updates the
>>>> ^^^^^^^^^^^^^^^^^^^^
>>>>> kernel configuration accordingly. It even requires the enable bit in the
>>>>> cap register to be set before doing this.
>>>>
>>>> That's the problem, we only do it on config writes overlapping the MSI-X
>>>> flags. We don't do anything for writes to the MSI-X table. It might be
>>>> as simple as calling assigned_dev_update_msix() from msix_mmio_writel()
>>>> when the mask bit is toggled. I'm not sure what might fall out of that
>>>> though.
>>>
>>> Ah indeed. Now I recall to have fixed this in my MSI-X refactoring
>>> series. I introduced config notifiers that are triggered by the MSI-X
>>> layer on every relevant modification, and the device assignment code
>>> hook the update function into this. I really need to dig into that
>>> series soon again and refresh it.
>>>
>>> In the meantime, we could try what you suggest (if the cap enable bit is
>>> set).
>>>
>>> Jan
>>>
Thread overview: 11+ messages
2011-12-19 14:19 FreeBSD guest with VTD NIC not passing traffic Shashidhar Patil
2012-01-04 3:21 ` Alex Williamson
[not found] ` <CADve3d6aTAEK8FqjxVuRnLGu+Efy1rmsb0n5H5rq1G0Eu1s6PA@mail.gmail.com>
2012-01-13 20:26 ` Alex Williamson
2012-01-13 21:00 ` Jan Kiszka
2012-01-13 21:05 ` Alex Williamson
2012-01-13 21:33 ` Jan Kiszka
2012-01-13 21:56 ` Alex Williamson
2012-01-13 22:15 ` Jan Kiszka
2012-01-14 6:46 ` Shashidhar Patil
2012-01-14 9:23 ` Shashidhar Patil
2012-01-14 20:25 ` Shashidhar Patil