public inbox for linux-pci@vger.kernel.org
 help / color / mirror / Atom feed
* PCIe cycle sequence when updating the msi-x table
@ 2023-04-07 16:17 David Laight
  2023-04-07 19:55 ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: David Laight @ 2023-04-07 16:17 UTC (permalink / raw)
  To: Thomas Gleixner, Jason Gunthorpe, Bjorn Helgaas,
	Christoph Hellwig, linux-pci@vger.kernel.org

The function that updates the MSI-X table currently reads:

static inline void pci_write_msg_msix(struct msi_desc *desc, struct msi_msg *msg)
{
	void __iomem *base = pci_msix_desc_addr(desc);
	u32 ctrl = desc->pci.msix_ctrl;
	bool unmasked = !(ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT);

	if (desc->pci.msi_attrib.is_virtual)
		return;
	/*
	 * The specification mandates that the entry is masked
	 * when the message is modified:
	 *
	 * "If software changes the Address or Data value of an
	 * entry while the entry is unmasked, the result is
	 * undefined."
	 */
	if (unmasked)
		pci_msix_write_vector_ctrl(desc, ctrl | PCI_MSIX_ENTRY_CTRL_MASKBIT);

	writel(msg->address_lo, base + PCI_MSIX_ENTRY_LOWER_ADDR);
	writel(msg->address_hi, base + PCI_MSIX_ENTRY_UPPER_ADDR);
	writel(msg->data, base + PCI_MSIX_ENTRY_DATA);

	if (unmasked)
		pci_msix_write_vector_ctrl(desc, ctrl);

	/* Ensure that the writes are visible in the device */
	readl(base + PCI_MSIX_ENTRY_DATA);
}

Now I'm not 100% sure about the cycle ordering rules.
I've not got the spec here, and I recall it isn't that easy to
understand.
I can't remember whether reads are allowed to overtake writes
(to different addresses).
I do remember that reading back the same address was needed to
flush the cpu store buffer on some space cpus.
So it might be that the final readl() won't actually flush
the write to the mask register.
(The same readback of 'the wrong' address also happens elsewhere.)

But there is a bigger problem.
As the comment says writing the address/data while an entry is
unmasked must be avoided (because a mixture of the old and new
values could easily by used for the PCIe write cycle).

But it is also quite likely that that hardware checks the masked
bit before/after reading the address+data.

So masking the interrupt immediately before the update and/or
unmasking immediately after could also cause issues.

I'd suggest adding a readl() of the mask register after the disable
and moving the readl() of the data to before the enable.
The delays inherent in the PCIe reads should be enough to ensure
that the interrupt generation logic doesn't run until all the
writes are complete.

(I'm also going to have to look at our fpga to see if the
global enable/mask bits are actually exported by the pcie block.
The MSI-X logic definitely doesn't have them as inputs.)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-04-10 17:25 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-07 16:17 PCIe cycle sequence when updating the msi-x table David Laight
2023-04-07 19:55 ` Thomas Gleixner
2023-04-07 22:06   ` David Laight
2023-04-10  6:50     ` Thomas Gleixner
2023-04-10 12:16       ` David Laight
2023-04-10 17:25         ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox