netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Disabling msix interrupts
@ 2017-02-06 15:33 David Laight
  2017-02-06 16:26 ` Alexander Duyck
  0 siblings, 1 reply; 6+ messages in thread
From: David Laight @ 2017-02-06 15:33 UTC (permalink / raw)
  To: netdev@vger.kernel.org

netdev probably isn't the right list for this, but I suspect people
reading it understand what happens.

I'm fairly sure that an msix interrupt can get raised after
the kernel thinks it has masked it.

When an msix interrupt is disabled I think msi_set_mask_bit()
(in drivers/pci/msi.c) is called to write a '1' to the card's
hardware MSIX mask register (the last 32bit word of the entry).
This function carefully reads back the mask register to flush
the write through the pcie bus.
Except it doesn't, it reads the 'address_lo' register instead! [1]

While this will stop the hardware raising any more interrupts,
it could easily be in the process of raising one.
ie have read the mask, found it zero, read the address and
data, and be in the process of issuing the pcie write.

The pcie write (to disable the interrupt) and readback are seen
by the hardware as (more or less) back to back transfers, so can
both easily overtake the request to raise the interrupt.
The pcie bus is also allowed to make a read completion tlp
overtake a write tlp.
Add in any host-side delays in raising the hardware interrupt
itself, and an interrupt could happen well after it was masked.

More worrying would be any code that tries to change the address
and data associated with an interrupt.
You'd need moderate guard times after the disable and before the
enable to ensure the hardware didn't raise an interrupt with
a mismatch of the old and new values.

[1] Maybe I'll look at the order those cycles actually arrive in.

	David

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Disabling msix interrupts
  2017-02-06 15:33 Disabling msix interrupts David Laight
@ 2017-02-06 16:26 ` Alexander Duyck
  2017-02-06 17:23   ` David Laight
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Duyck @ 2017-02-06 16:26 UTC (permalink / raw)
  To: David Laight; +Cc: netdev@vger.kernel.org, linux-pci@vger.kernel.org

On Mon, Feb 6, 2017 at 7:33 AM, David Laight <David.Laight@aculab.com> wrote:
> netdev probably isn't the right list for this, but I suspect people
> reading it understand what happens.
>
> I'm fairly sure that an msix interrupt can get raised after
> the kernel thinks it has masked it.
>
> When an msix interrupt is disabled I think msi_set_mask_bit()
> (in drivers/pci/msi.c) is called to write a '1' to the card's
> hardware MSIX mask register (the last 32bit word of the entry).
> This function carefully reads back the mask register to flush
> the write through the pcie bus.
> Except it doesn't, it reads the 'address_lo' register instead! [1]

Reading any register will force the write to be flushed as all PCI
writes must be completed before a PCI read.  For example in the Intel
drivers we read register 0 to flush a write of any of the other
registers.

> While this will stop the hardware raising any more interrupts,
> it could easily be in the process of raising one.
> ie have read the mask, found it zero, read the address and
> data, and be in the process of issuing the pcie write.
>
> The pcie write (to disable the interrupt) and readback are seen
> by the hardware as (more or less) back to back transfers, so can
> both easily overtake the request to raise the interrupt.

I don't believe this is correct.  On the PCI bus what you should see
is the device aware that interrupts are disabled before the completion
arrives and any MSI messages should arrive before the last read
completion.

> The pcie bus is also allowed to make a read completion tlp
> overtake a write tlp.
> Add in any host-side delays in raising the hardware interrupt
> itself, and an interrupt could happen well after it was masked.

That is only if relaxed ordering is enabled, which I am not sure is
supported for MSI-X interrupts.  That being said though I believe
there are some platforms that could end up seeing a delay in handling
the interrupt if for example the CPU the interrupt was delivered to
was in a sleep state.

> More worrying would be any code that tries to change the address
> and data associated with an interrupt.
> You'd need moderate guard times after the disable and before the
> enable to ensure the hardware didn't raise an interrupt with
> a mismatch of the old and new values.
>
> [1] Maybe I'll look at the order those cycles actually arrive in.
>
>         David

So I have thrown in my $.02 on this, and added the linux-pci mailing
list.  That is a much better place to bring this up rather than
netdev.

Thanks.

- Alex

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Disabling msix interrupts
  2017-02-06 16:26 ` Alexander Duyck
@ 2017-02-06 17:23   ` David Laight
  2017-02-06 19:14     ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: David Laight @ 2017-02-06 17:23 UTC (permalink / raw)
  To: 'Alexander Duyck'
  Cc: netdev@vger.kernel.org, linux-pci@vger.kernel.org

From: Alexander 
> Sent: 06 February 2017 16:27
> To: David Laight
> On Mon, Feb 6, 2017 at 7:33 AM, David Laight <David.Laight@aculab.com> wrote:
> > netdev probably isn't the right list for this, but I suspect people
> > reading it understand what happens.
> >
> > I'm fairly sure that an msix interrupt can get raised after
> > the kernel thinks it has masked it.
> >
> > When an msix interrupt is disabled I think msi_set_mask_bit()
> > (in drivers/pci/msi.c) is called to write a '1' to the card's
> > hardware MSIX mask register (the last 32bit word of the entry).
> > This function carefully reads back the mask register to flush
> > the write through the pcie bus.
> > Except it doesn't, it reads the 'address_lo' register instead! [1]
> 
> Reading any register will force the write to be flushed as all PCI
> writes must be completed before a PCI read.  For example in the Intel
> drivers we read register 0 to flush a write of any of the other
> registers.

Sorry brain fade on that one.
Although the 'store buffer' on the sparc cpus I used to use would
let reads overtake writes. So you did have to read back the address
of the last write - not sure about modern sparc cpus.

> > While this will stop the hardware raising any more interrupts,
> > it could easily be in the process of raising one.
> > ie have read the mask, found it zero, read the address and
> > data, and be in the process of issuing the pcie write.
> >
> > The pcie write (to disable the interrupt) and readback are seen
> > by the hardware as (more or less) back to back transfers, so can
> > both easily overtake the request to raise the interrupt.
> 
> I don't believe this is correct.  On the PCI bus what you should see
> is the device aware that interrupts are disabled before the completion
> arrives and any MSI messages should arrive before the last read
> completion.

You are making the assumption that it takes the hardware zero
time to raise msix interrupts, and/or that the hardware logic that
raises the interrupts is tightly coupled with that which generates
the read completions.

The fpga logic we use for raising interrupts reads the mask last,
but then has to do three 'bus' cycles to get the 64bit address
configured and request the pcie write.
Meanwhile the pcie <=> msix table logic will be running asynchronously
so will quite easily complete the readback before the request itself
is sent.
I'm pretty sure there is enough async in the 'write' path that even if
I aborted raising the interrupt if there was a write to the msix table
the read completion could still overtake the write.

> > The pcie bus is also allowed to make a read completion tlp
> > overtake a write tlp.
> > Add in any host-side delays in raising the hardware interrupt
> > itself, and an interrupt could happen well after it was masked.
> 
> That is only if relaxed ordering is enabled, which I am not sure is
> supported for MSI-X interrupts.

The PCIE book on my desk is not helpful wrt reordering.
I know reads and read completion can be reordered - I fixed the fpga
logic (from the manufacturer) so that it correctly processed interleaved
read completions.
I don't see anything obvious that would require writes and read completions
to be kept in order.

> That being said though I believe
> there are some platforms that could end up seeing a delay in handling
> the interrupt if for example the CPU the interrupt was delivered to
> was in a sleep state.
>
> > More worrying would be any code that tries to change the address
> > and data associated with an interrupt.
> > You'd need moderate guard times after the disable and before the
> > enable to ensure the hardware didn't raise an interrupt with
> > a mismatch of the old and new values.
> >
> > [1] Maybe I'll look at the order those cycles actually arrive in.
> >
> >         David
> 
> So I have thrown in my $.02 on this, and added the linux-pci mailing
> list.  That is a much better place to bring this up rather than
> netdev.
> 
> Thanks.
> 
> - Alex

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Disabling msix interrupts
  2017-02-06 17:23   ` David Laight
@ 2017-02-06 19:14     ` David Miller
  2017-02-07  9:55       ` David Laight
  0 siblings, 1 reply; 6+ messages in thread
From: David Miller @ 2017-02-06 19:14 UTC (permalink / raw)
  To: David.Laight; +Cc: alexander.duyck, netdev, linux-pci

From: David Laight <David.Laight@ACULAB.COM>
Date: Mon, 6 Feb 2017 17:23:54 +0000

> Although the 'store buffer' on the sparc cpus I used to use would
> let reads overtake writes. So you did have to read back the address
> of the last write - not sure about modern sparc cpus.

Never would any sparc cpu do so when any of the operations involved
were to "side effect" locations, as PCI config space is.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Disabling msix interrupts
  2017-02-06 19:14     ` David Miller
@ 2017-02-07  9:55       ` David Laight
  2017-02-07 16:26         ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: David Laight @ 2017-02-07  9:55 UTC (permalink / raw)
  To: 'David Miller'
  Cc: alexander.duyck@gmail.com, netdev@vger.kernel.org,
	linux-pci@vger.kernel.org

From: David Miller
> Sent: 06 February 2017 19:15
> From: David Laight <David.Laight@ACULAB.COM>
> Date: Mon, 6 Feb 2017 17:23:54 +0000
> 
> > Although the 'store buffer' on the sparc cpus I used to use would
> > let reads overtake writes. So you did have to read back the address
> > of the last write - not sure about modern sparc cpus.
> 
> Never would any sparc cpu do so when any of the operations involved
> were to "side effect" locations, as PCI config space is.

I guess they used non-zero ASI, and that forced the flush??
Normal uncached memory reads would overtake writes.
(These were SuperSparc (Viking)).

	David

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Disabling msix interrupts
  2017-02-07  9:55       ` David Laight
@ 2017-02-07 16:26         ` David Miller
  0 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2017-02-07 16:26 UTC (permalink / raw)
  To: David.Laight; +Cc: alexander.duyck, netdev, linux-pci

From: David Laight <David.Laight@ACULAB.COM>
Date: Tue, 7 Feb 2017 09:55:47 +0000

> From: David Miller
>> Sent: 06 February 2017 19:15
>> From: David Laight <David.Laight@ACULAB.COM>
>> Date: Mon, 6 Feb 2017 17:23:54 +0000
>> 
>> > Although the 'store buffer' on the sparc cpus I used to use would
>> > let reads overtake writes. So you did have to read back the address
>> > of the last write - not sure about modern sparc cpus.
>> 
>> Never would any sparc cpu do so when any of the operations involved
>> were to "side effect" locations, as PCI config space is.
> 
> I guess they used non-zero ASI, and that forced the flush??
> Normal uncached memory reads would overtake writes.
> (These were SuperSparc (Viking)).

On sun4m it was controlled by bits in the physical address.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-02-07 16:26 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-02-06 15:33 Disabling msix interrupts David Laight
2017-02-06 16:26 ` Alexander Duyck
2017-02-06 17:23   ` David Laight
2017-02-06 19:14     ` David Miller
2017-02-07  9:55       ` David Laight
2017-02-07 16:26         ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).