* RE: revert bab65e48cb064 PCI/MSI Sanitize MSI-X checks
2023-04-06 15:07 ` revert bab65e48cb064 PCI/MSI Sanitize MSI-X checks Bjorn Helgaas
@ 2023-04-06 15:36 ` David Laight
2023-04-06 19:46 ` Thomas Gleixner
1 sibling, 0 replies; 3+ messages in thread
From: David Laight @ 2023-04-06 15:36 UTC (permalink / raw)
To: 'Bjorn Helgaas'
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner, Jason Gunthorpe,
Bjorn Helgaas, Linus Torvalds, Christoph Hellwig,
linux-pci@vger.kernel.org, regressions@lists.linux.dev
From: Bjorn Helgaas
> Sent: 06 April 2023 16:08
>
> [+cc linux-pci, regressions]
>
> On Thu, Apr 06, 2023 at 11:05:14AM +0000, David Laight wrote:
> > The change in bab65e48cb064 breaks pci_enable_msix_range().
> > The intent is to optimise the sanity checks, but it is
> > somewhat overenthusiastic.
> >
> > The interface allows you to ask for a lot of vectors and
> > returns the number that were allocated.
> > However, after the change, you can't request a vector
> > that is higher than the largest the hardware supports.
> > Which makes that rather pointless.
> >
> > So code like:
> > for (i = 0; i < 16; i++)
> > msix_tbl[i].entry = i;
> > nvec = pci_enable_msix_range(dev, msix_tbl, 1, 16);
> > Now returns -22 if the hardware only supports 8 interrupts.
> >
> > Previously it returned 8.
> >
> > I can fix my driver, but I suspect that any code that relies
> > on a smaller number of vectors being returned is now broken.
>
> Thanks for the report! bab65e48cb06 ("PCI/MSI: Sanitize MSI-X
> checks") appeared in v6.2-rc1, so this is a recent regression and it
> would be good to fix it for v6.3.
I do try to test every release at around rc3.
> bab65e48cb06 only touches drivers/pci/msi/msi.c, but since it didn't
> go through the PCI tree, I'll let Thomas handle any revert (or better,
> an improvement to pci_msix_validate_entries()) since he wrote and
> applied the original.
Looking it:
static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *entries,
int nvec, int hwsize)
{
bool nogap;
int i, j;
if (!entries)
return true;
nogap = pci_msi_domain_supports(dev, MSI_FLAG_MSIX_CONTIGUOUS, DENY_LEGACY);
for (i = 0; i < nvec; i++) {
/* Entry within hardware limit? */
if (entries[i].entry >= hwsize)
return false;
/* Check for duplicate entries */
for (j = i + 1; j < nvec; j++) {
if (entries[i].entry == entries[j].entry)
return false;
}
/* Check for unsupported gaps */
if (nogap && entries[i].entry != i)
return false;
}
return true;
}
It probably needs to return an updated 'nvec'.
The gap/duplicate check is also a bit horrid, why not:
if (nogap) {
if (entries[i].entry != i)
return false;
continue;
}
if (!i || entries[i].entry > entries[i - 1].entry)
continue;
horrid, expensive loop...
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: revert bab65e48cb064 PCI/MSI Sanitize MSI-X checks
2023-04-06 15:07 ` revert bab65e48cb064 PCI/MSI Sanitize MSI-X checks Bjorn Helgaas
2023-04-06 15:36 ` David Laight
@ 2023-04-06 19:46 ` Thomas Gleixner
1 sibling, 0 replies; 3+ messages in thread
From: Thomas Gleixner @ 2023-04-06 19:46 UTC (permalink / raw)
To: Bjorn Helgaas, David Laight
Cc: linux-kernel@vger.kernel.org, Jason Gunthorpe, Bjorn Helgaas,
Linus Torvalds, Christoph Hellwig, linux-pci, regressions
On Thu, Apr 06 2023 at 10:07, Bjorn Helgaas wrote:
> On Thu, Apr 06, 2023 at 11:05:14AM +0000, David Laight wrote:
> Thanks for the report! bab65e48cb06 ("PCI/MSI: Sanitize MSI-X
> checks") appeared in v6.2-rc1, so this is a recent regression and it
> would be good to fix it for v6.3.
>
> bab65e48cb06 only touches drivers/pci/msi/msi.c, but since it didn't
> go through the PCI tree, I'll let Thomas handle any revert (or better,
> an improvement to pci_msix_validate_entries()) since he wrote and
> applied the original.
Right. The fix is trivial as the hardware size check in this validation
function is pointless.
The point is that for a range allocation with and entries array, _all_
entries up to max_vec must be correct independent of the actual hardware
size.
So the fix is simply removing the hardware size check from the
validation function.
The hardware size checking happens afterwards anyway.
Thanks,
tglx
---
drivers/pci/msi/msi.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -750,8 +750,7 @@ static int msix_capability_init(struct p
return ret;
}
-static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *entries,
- int nvec, int hwsize)
+static bool pci_msix_validate_entries(struct pci_dev *dev, struct msix_entry *entries, int nvev)
{
bool nogap;
int i, j;
@@ -762,10 +761,6 @@ static bool pci_msix_validate_entries(st
nogap = pci_msi_domain_supports(dev, MSI_FLAG_MSIX_CONTIGUOUS, DENY_LEGACY);
for (i = 0; i < nvec; i++) {
- /* Entry within hardware limit? */
- if (entries[i].entry >= hwsize)
- return false;
-
/* Check for duplicate entries */
for (j = i + 1; j < nvec; j++) {
if (entries[i].entry == entries[j].entry)
@@ -805,7 +800,7 @@ int __pci_enable_msix_range(struct pci_d
if (hwsize < 0)
return hwsize;
- if (!pci_msix_validate_entries(dev, entries, nvec, hwsize))
+ if (!pci_msix_validate_entries(dev, entries, nvec))
return -EINVAL;
if (hwsize < nvec) {
^ permalink raw reply [flat|nested] 3+ messages in thread